Assessment Capacity, Cultural Validity, and Consequential Validity in PISA
DOI:
https://doi.org/10.7203/relieve.22.1.8281

Keywords:
PISA, assessment capacity, cultural validity, consequential validity.

Abstract
International student assessments have played an increasingly important role in educational policy. These international test comparisons generate valuable information about each participating country’s student performance and the social and contextual factors that shape it. A complex picture of the cultural, economic, and social factors that shape PISA participation is emerging. We aim to understand the relationship between national assessment capacity and how countries participate in international test comparisons. We propose a framework for examining assessment capacity as key to addressing two aspects of validity: cultural and consequential. We also discuss the multiple facets of assessment capacity as conditions for addressing cultural validity and consequential validity in international test comparisons.
License
Authors grant RELIEVE, on a non-exclusive basis, the exploitation rights to the published works (solely for the purpose of promoting the dissemination of published articles: signing distribution agreements, inclusion in databases, etc.) and consent to their distribution under the Creative Commons Attribution-NonCommercial 4.0 International license (CC-BY-NC 4.0), which allows third parties to use the published material provided that authorship and the publication source are acknowledged and the use is non-commercial.
Authors may enter into additional, separate contractual arrangements for the non-exclusive distribution of the version of the work published in this journal (for example, depositing it in an institutional repository or publishing it in a book), provided that this journal is clearly cited as the original source of publication.
Authors are encouraged to share their work online after publication (for example, in institutional repositories or on their own website), as this can lead to productive exchanges and increase citations of the published work.
Submission of an article to RELIEVE implies acceptance of these terms.