Assessment Capacity, Cultural Validity, and Consequential Validity in PISA

Authors

  • Guillermo Solano-Flores Stanford University
  • Tamara Milbourn University of Colorado Boulder

DOI:

https://doi.org/10.7203/relieve.22.1.8281

Keywords:

PISA, assessment capacity, cultural validity, consequential validity.

Abstract

International student assessments have played an increasingly important role in educational policy. These international test comparisons generate valuable information about each participating country’s student performance and the associated social and contextual factors. A complex picture of the cultural, economic, and social factors that shape PISA participation is emerging. We aim to understand the relationship between national assessment capacity and how countries participate in international test comparisons. We propose a framework for examining assessment capacity as key to addressing two aspects of validity: cultural and consequential. We also discuss the multiple facets of assessment capacity as conditions for addressing cultural validity and consequential validity in international test comparisons.

Author Biographies

Guillermo Solano-Flores, Stanford University

is the corresponding author for this article. He is professor of education at the Graduate School of Education, Stanford University, US. He specializes in educational assessment and the linguistic and cultural issues that are relevant to both international test comparisons and the testing of cultural and linguistic minorities. His contributions to the field of educational assessment include the theory of test translation error, the use of generalizability theory (a psychometric theory of measurement error) in the testing of linguistic minorities, the formalization of the concept of cultural validity, and the development of a methodology for designing and analyzing illustrations used in test items.

Tamara Milbourn, University of Colorado Boulder

is a Ph.D. Candidate in Educational Foundations, Policy and Practice at the University of Colorado Boulder with a Master's Degree in Applied English Linguistics from the University of Wisconsin-Madison, US. Her work examines the experiences of international students on American campuses. She is interested in issues in education related to mono/multi-lingualism, with an emphasis on issues of equity as connected to language practices and academic norms. She has worked in Taiwan, Japan and Benin and is currently teaching educational policy and linguistics courses in the University of Colorado System.

Published

2016-07-08

Section

Special Section