A Methodological Critique of the PISA Evaluations
DOI: https://doi.org/10.7203/relieve.22.1.8806

Keywords: Evaluation, Meta-evaluation, Methodology of evaluation, PISA

Abstract

This paper conducts a methodological evaluation of the PISA international evaluations, offering a critical analysis of their shortcomings and limitations. A methodological review, or meta-evaluation, has been carried out on the multiple PISA reports in an attempt to assess the plausible validity of the inferences that PISA maintains, given a series of methodological limitations such as: an inconsistent rationale, opaque sampling, an unstable evaluative design, measuring instruments of questionable validity, opportunistic use of scores transformed by standardization, reverential confidence in statistical significance, an absence of substantively significant statistics centered on the magnitudes of effects, a problematic presentation of findings, and questionable implications drawn from the findings for educational norms and practice. The onus is on PISA to demonstrate greater methodological rigor in future technical reports, and there is a consequent need to show great caution lest unfounded inferences be drawn from their findings.

References
Adams, R. J. (2003). Response to ‘Cautions on OECD's recent educational survey (PISA)’. Oxford Review of Education, 29(3), 377-389. doi: http://dx.doi.org/10.1080/0305498032000120319
Adams, R., Berezner, A. & Jakubowski, M. (2010). Analysis of PISA 2006 preferred items ranking using the percent correct method. Paris: OECD. Retrieved from http://www.oecd.org/pisa/pisaproducts/pisa2006/44919855.pdf
Adams, R. J., Wu, M. L. & Carstensen, C. H. (2007). Application of multivariate Rasch models in international large-scale educational assessments. In M. von Davier & C. H. Carstensen (Eds.), Multivariate and mixture distribution Rasch models (pp. 271–280). New York: Springer.
American Psychological Association (1994). Publication manual of the American Psychological Association (4th ed.). Washington, DC: APA.
Bank, V. (2012). On OECD policies and the pitfalls in economy-driven education: The case of Germany. Journal of Curriculum Studies, 44(2), 193-210. doi: http://dx.doi.org/10.1080/00220272.2011.639903
Berliner, D. C. (2011). The context for interpreting PISA results in the USA. Negativism, chauvinism, misunderstanding, and the potential to distort the educational systems of nations. In M. Pereira, H-G. Kotthoff & R. Cowen (Eds.), PISA under examination: Changing knowledge, changing tests, and changing schools (pp. 77-96). Rotterdam: Sense Publishers
Berliner, D. C. (2015). The many facets of PISA. Teachers College Record, 117(1), 20.
Bollen, K., Paxton, P. & Morishima, R. (2005). Assessing international evaluations. An example from USAID's democracy and governance program. American Journal of Evaluation, 26(2), 189-203. doi: http://dx.doi.org/10.1177/1098214005275640
Brown, G., Micklewright, J., Schnepf, S. V. & Waldmann, R. (2007). International surveys of educational achievement: How robust are the findings? Journal of the Royal Statistical Society Series A-Statistics in Society, 170(3), 623-646. doi: http://dx.doi.org/10.1111/j.1467-985X.2006.00439.x
Carnoy, M. & Rothstein, R. (2013). International tests show achievement gaps in all countries, with big gains for U.S. disadvantaged students. Economic Policy Institute: Washington, DC. Retrieved from http://www.epi.org/blog/international-tests-achievement-gaps-gains-american-students/
Couso, D. (2009). Y después de PISA, ¿qué? Propuestas para desarrollar la competencia científica en el aula de ciencias [And after PISA, what? Proposals to develop scientific competence in the science classroom]. Enseñanza de las Ciencias, special issue, 3547-3550.
DeSeCo Project (2008). Definition and selection of competencies: Theoretical and conceptual foundations. Paris: OECD. Retrieved from http://www.deseco.admin.ch/
De Witte, K. & Kortelainen, M. (2013). What explains the performance of students in a heterogeneous environment? Conditional efficiency estimation with continuous and discrete environmental variables. Applied Economics, 45(17), 2401-2412. doi: http://dx.doi.org/10.1080/00036846.2012.665602
Dolin, J. & Krogh, L. B. (2010). The relevance and consequences of PISA science in a Danish context. International Journal of Science and Mathematics Education, 8(3), 565-592. doi: http://dx.doi.org/10.1007/s10763-010-9207-6
Drechsel, B., Carstensen, C. & Prenzel, M. (2011). The role of content and context in PISA interest scales: A study of the embedded interest items in the PISA 2006 science assessment. International Journal of Science Education, 33(1), 73-95. doi: http://dx.doi.org/10.1080/09500693.2011.518646
Ehmke, T., Drechsel, B. & Carstensen, C. H. (2008). Klassenwiederholen in PISA-I-Plus: Was lernen Sitzenbleiber in Mathematik? [Grade repetition in PISA-I-Plus: What do students who repeat a class learn in mathematics?]. Zeitschrift für Erziehungswissenschaft, 11(3), 368-387. doi: http://dx.doi.org/10.1007/s11618-008-0033
Ercikan, K., Roth, W-M. & Asil, M. (2015). Cautions about inferences from international assessments: The case of PISA 2009. Teachers College Record, 117(1), 1-28.
Fernández-Cano, A. & Fernández-Guerrero, I. M. (2009). Crítica y alternativas a la significación estadística en el contraste de hipótesis [Critique of and alternatives to statistical significance in hypothesis testing]. Colección Cuadernos de Estadística, nº 37. Madrid: Arco Libros-La Muralla.
Glass, G. V. (1977). Integrating findings: The meta-analysis of research. Review of Research in Education, 5(1), 351-379.
Grisay, A., De Jong, Gebhardt, E., Berezner, A. & Halleux-Monseur, B. (2007). Translation equivalence across PISA countries. Journal of Applied Measurement, 8(3), 249-266.
Gur, B. S., Celik, Z. & Ozoglu, M. (2012). Policy options for Turkey: A critique of the interpretation and utilization of PISA results in Turkey. Journal of Education Policy, 27(1), 1-21. doi: http://dx.doi.org/10.1080/02680939.2011.595509
Hanberger, A. (2014). What PISA intends to and can possibly achieve: A critical programme theory analysis. European Educational Research Journal, 13(2), 167-180. doi: http://dx.doi.org/10.2304/eerj.2014.13.2.167
Hartig, J. & Frey, A. (2012). Validity of a standard-based test for mathematical competencies. Relations with the competencies assessed in PISA and variance between schools and school tracks. Diagnostica, 58(1), 3-14. doi: http://dx.doi.org/10.1026/0012-1924/a000064
Hohensinn, C., Kubinger, K. D., Reif, M., Schleicher, E. & Khorramdel, L. (2011). Analysing item position effects due to test booklet design within large-scale assessment. Educational Research and Evaluation, 17(6), 497-509. doi: http://dx.doi.org/10.1080/13803611.2011.632668
IEA (2012). The International Association for the Evaluation of Educational Achievement. Retrieved from http://www.iea.nl/
Jerrim, J. (2011). England's "plummeting" PISA test scores between 2000 and 2009: Is the performance of our secondary school pupils really in relative decline? (No. 11-09). London: Department of Quantitative Social Science, Institute of Education, University of London.
Judkins, D. R. (1990). Fay’s method of variance estimation. Journal of Official Statistics, 6, 223-239.
Kjaernsli, M. & Lie, S. (2011). Students' preference for science careers: International comparisons based on PISA 2006. International Journal of Science Education, 33(1), 121-144. doi: http://dx.doi.org/10.1080/09500693.2011.518642
Knipprath, H. (2010). What PISA tells us about the quality and inequality of Japanese Education in Mathematics and Science. International Journal of Science and Mathematics Education, 8(3), 389-408.
Kreiner, S. & Christensen, K. B. (2014). Analyses of model fit and robustness. A new look at the PISA scaling model underlying ranking of countries according to reading literacy. Psychometrika, 79(2), 210-231. doi: http://dx.doi.org/10.1007/s11336-013-9347-z
Kubinger, K. D., Hohensinn, C., Hofer, S., Khorramdel, L., Freborta, M., Holocher-Ertl, S., Reif, M. & Sonnleitner, P. (2011). Designing the test booklets for Rasch model calibration in a large-scale assessment with reference to numerous moderator variables and several ability dimensions. Educational Research and Evaluation, 17(6), 483-495. doi: http://dx.doi.org/10.1080/13803611.2011.632666
Lafourcade, P. (1971). Evaluación de los aprendizajes [Learning evaluation]. Buenos Aires: Kapelusz.
Lee, J. (2014). An attempt to reinterpret student learning outcomes: A cross-national comparative study. Peabody Journal of Education, 89(1), 106-122. doi: http://dx.doi.org/10.1080/0161956X.2014.862476
Lu, Y. & Bolt, D. M. (2015). Examining the attitude-achievement paradox in PISA using a multilevel multidimensional IRT model for extreme response style. Large-scale Assessments in Education, 3(2). doi: http://dx.doi.org/10.1186/s40536-015-0012-0
Lynn, R. & Mikk, J. (2009). Sex differences in reading achievement. Trames-Journal of the Humanities and Social Sciences, 13(1), 3-13. doi: http://dx.doi.org/10.3176/tr.2009.1.01
Meyer, H-D. & Benavot, A. (Eds.). (2013). PISA, power, and policy. The emergence of global educational governance. Providence, RI: Symposium Books.
Ministerio de Educación, Cultura y Deporte [MECD] (2010). PISA 2009. Programa para la Evaluación Internacional de los Alumnos. OCDE. Informe español [PISA 2009. The Spanish report]. Madrid: Instituto de Evaluación. Retrieved from http://www.educacion.gob.es/dctm/ministerio/horizontales/prensa/notas/2010/20101207-pisa2009-informe-espanol.pdf?documentId=0901e72b806ea35a
National Center for Education Statistics (2012). National Assessment of Educational Progress. Retrieved from http://nces.ed.gov/nationsreportcard/
Olsen, R. V. & Lie, S. (2011). Profiles of students' interest in science issues around the world: Analysis of data from PISA 2006. International Journal of Science Education, 33(1), 97-120. doi: http://dx.doi.org/10.1080/09500693.2011.518638
Organisation for Economic Co-operation and Development (2009). PISA 2009 key findings. Paris: OECD Publishing. Retrieved from http://www.oecd.org/pisa/pisaproducts/pisa2009/pisa2009keyfindings.htm
Organisation for Economic Co-operation and Development (2012). PISA 2009 technical report. Paris: OECD Publishing. doi: http://dx.doi.org/10.1787/9789264167872-en
Organisation for Economic Co-operation and Development (2013a). PISA 2012 Results in focus. What 15-year-olds know and what they can do with what they know. Retrieved from http://www.oecd.org/pisa/keyfindings/pisa-2012-results-overview.pdf
Organisation for Economic Co-operation and Development (2013b). PISA 2012 results. What makes schools successful? Resources, policies and practices. Vol. 4. Paris: OECD. Retrieved from http://www.oecd.org/pisa/keyfindings/pisa-2012-results-volume-IV.pdf
Organisation for Economic Co-operation and Development (2014a). SPAIN – Country note –Results from PISA 2012 problem solving. Retrieved from http://www.oecd.org/spain/PISA-2012-PS-results-eng-SPAIN.pdf
Organisation for Economic Co-operation and Development (2014b). PISA 2012 technical report. Paris: OECD. Retrieved from http://www.oecd.org/pisa/pisaproducts/PISA-2012-technical-report-final.pdf
Organisation for Economic Co-operation and Development (2015a). School governance, assessments and accountability. Paris: OECD. Retrieved from http://www.oecd.org/pisa/keyfindings/Vol4Ch4.pdf
Organisation for Economic Co-operation and Development (2015b). PISA 2012 results. Paris: OECD. Retrieved from http://www.oecd.org/pisa/keyfindings/pisa-2012-results.htm
Prais, S. J. (2003). Cautions on OECD'S recent educational survey (PISA). Oxford Review of Education, 29(2), 139-163. doi: http://dx.doi.org/10.1080/0305498032000080657
Reilly, D. (2012). Gender, culture, and sex-typed cognitive abilities. PLoS ONE, 7(7). doi: http://dx.doi.org/10.1371/journal.pone.0039904
Rutkowski, L. (2014). Sensitivity of achievement estimation to conditioning model misclassification. Applied Measurement in Education, 27(2), 115-132. doi: http://dx.doi.org/10.1080/08957347.2014.880440
Rychen, D. S. & Salganik, L. H. (2003). Key competencies for a successful life and a well-functioning society. Göttingen: Hogrefe & Huber.
Scriven, M. (2011). Evaluating evaluations: A meta-evaluation checklist (6th ed.). Retrieved from http://michaelscriven.info/images/EvaluatingEvals-Checklist.pdf
Smith, E. (2009). Underachievement, failing youth and moral panics. Evaluation & Research in Education, 23(1), 37-49.
Strietholt, R., Rosén, M. & Bos, W. (2013). A correction model for differences in the sample compositions: the degree of comparability as a function of age and schooling. Large-scale Assessments in Education, 1(1). doi: http://dx.doi.org/10.1186/2196-0739-1-1
Stufflebeam, D. (2011). Meta-evaluation. Journal of MultiDisciplinary Evaluation, 7(15), 99-158.
Takayama, K. (2008). The politics of international league Tables: PISA in Japan's achievement crisis debate. Comparative Education, 44(4), 387-407. doi: http://dx.doi.org/10.1080/03050060802481413
Wang, M.C., Haertel, G.D. & Walberg, H.J. (1993). Toward a knowledge base for school learning. Review of Educational Research, 63(3), 249-294. doi: http://dx.doi.org/10.3102/00346543063003249
Wikipedia (2014). Informe PISA [PISA Report]. Retrieved from http://es.wikipedia.org/wiki/Informe_PISA
Yarbrough, D.B., Shulha, L.M., Hopson, R.K. & Caruthers, F.A. (2011). The program evaluation standards: A guide for evaluators and evaluation users (3rd ed.). Thousand Oaks, CA: Sage.
Yore, L. D., Anderson, J. O. & Chiu, M. H. (2010). Moving PISA results into the policy arena: Perspectives on knowledge transfer for future considerations and preparations. International Journal of Science and Mathematics Education, 8(3), 593-609.
Zuckerman, G. A., Kovaleva, G. S. & Kuznetsova, M. I. (2013). Between PIRLS and PISA: The advancement of reading literacy in a 10-15-year-old cohort. Learning and Individual Differences, 26, 64-73. doi: http://dx.doi.org/10.1016/j.lindif.2013.05.001
License
The authors grant non-exclusive rights of exploitation of the works published to RELIEVE and consent to their distribution under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which allows third parties to use the published material provided that the authorship of the work and the source of publication are acknowledged, and that the material is used for non-commercial purposes.
The authors may enter into additional, independent contractual agreements for the non-exclusive distribution of the version of the work published in this journal (for example, by including it in an institutional repository or publishing it in a book), as long as it is clearly stated that the original source of publication is this journal.
Authors are encouraged to disseminate their work after publication via the internet (for example, in online institutional archives or on their own websites), which can generate productive exchanges and increase citations of the published work.
Submitting your paper to RELIEVE implies that you accept these conditions.