A methodological critique of the PISA evaluations
DOI:
https://doi.org/10.7203/relieve.22.1.8806
Keywords:
Evaluation, Meta-evaluation, Evaluation methodology, PISA.
Abstract
This paper presents a methodological evaluation of the PISA international assessments, with a critical analysis of their shortcomings and limitations. We carry out a methodological review, or meta-evaluation, of the multiple PISA reports in an attempt to establish how plausibly valid the inferences maintained by PISA are, given a series of methodological limitations: an inconsistent rationale, opaque sampling, an unstable evaluative design, measuring instruments of questionable validity, opportunistic use of scores transformed by standardization, reverential confidence in statistical significance, the absence of substantively important statistics centered on effect magnitudes, a problematic presentation of findings, and questionable implications drawn from the results for educational practice and legislation. The onus is on PISA to provide and demonstrate greater methodological rigor in future technical reports, and, consequently, to take care not to draw unfounded inferences from its findings.
License
Authors grant RELIEVE, on a non-exclusive basis, the exploitation rights to their published works (solely for the purpose of promoting the dissemination of the published articles: signing dissemination contracts, agreements for inclusion in databases, etc.) and consent to their distribution under the Creative Commons Attribution-NonCommercial 4.0 International license (CC-BY-NC 4.0), which allows third parties to use the published material provided that the authorship of the work and the source of publication are acknowledged and that the use is non-commercial.
Authors may enter into additional, separate contractual arrangements for the non-exclusive distribution of the version of the work published in this journal (for example, placing it in an institutional repository or publishing it in a book), provided that it is clearly stated that the original source of publication is this journal.
Authors are encouraged to disseminate their work online after publication (for example, in institutional online archives or on their own website), which can lead to interesting exchanges and increase citations of the work.
The mere submission of an article to RELIEVE implies acceptance of these conditions.