Validity for Automatic Generation of Items for the Basic Competences Exam (Excoba)
DOI: https://doi.org/10.7203/relieve.22.1.8048

Keywords: Automatic Item Generation, Educational Testing, Construct Validity, Factor Structure, Item Analysis

Abstract
Automatic Item Generation (AIG) is the process of designing and producing test items, as well as generating different versions of an exam that are conceptually and statistically equivalent. AIG tools are developed with the assistance of information systems, which makes them highly efficient. With this aim, GenerEx, an automatic item generation tool, was developed; it is used to automatically generate different versions of the Basic Competences Exam (Excoba). Although AIG represents a great advance for psychological and educational assessment, obtaining validity evidence for the enormous number of items and tests that an automatic process can produce is a methodological challenge. The purpose of this paper is to describe an approach for analyzing the internal structure and psychometric equivalence of exams generated by GenerEx, and to illustrate the kinds of results obtained with it. The approach is based on drawing samples of exams from the generation tool, under the assumption that items and exams must be psychometrically equivalent. The work includes three conceptually different and complementary kinds of analysis: Classical Test Theory, Item Response Theory, and Confirmatory Factor Analysis. Results show that GenerEx produces psychometrically similar exams, although there are problems in some learning areas. The methodology was useful for describing the psychometric functioning of GenerEx and the internal structure of two randomly generated versions of Excoba. The analysis can be complemented by a qualitative study of the deficiencies of these items.
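To make the equivalence check concrete, the following is a minimal illustrative sketch (not the authors' GenerEx code) of how two randomly generated exam versions might be compared under Classical Test Theory. The response matrices, sample sizes, and function names are hypothetical; a full study would also include the Item Response Theory and Confirmatory Factor Analysis steps described above.

```python
# Illustrative sketch: CTT comparison of two automatically generated exam versions.
# Responses are assumed to be 0/1-scored matrices (examinees x items).
import numpy as np

def item_difficulty(responses: np.ndarray) -> np.ndarray:
    """Proportion of correct answers per item (CTT p-value)."""
    return responses.mean(axis=0)

def item_discrimination(responses: np.ndarray) -> np.ndarray:
    """Corrected item-total (point-biserial) correlation per item."""
    total = responses.sum(axis=1)
    discs = []
    for j in range(responses.shape[1]):
        rest = total - responses[:, j]  # total score excluding the item itself
        discs.append(np.corrcoef(responses[:, j], rest)[0, 1])
    return np.array(discs)

def cronbach_alpha(responses: np.ndarray) -> float:
    """Internal-consistency reliability of a test form."""
    k = responses.shape[1]
    item_var = responses.var(axis=0, ddof=1).sum()
    total_var = responses.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical response data for two randomly generated versions of the exam.
rng = np.random.default_rng(0)
version_a = (rng.random((500, 30)) < 0.65).astype(int)
version_b = (rng.random((500, 30)) < 0.65).astype(int)

# Psychometrically equivalent forms should show similar difficulty and
# discrimination profiles (item models matched position by position)
# and similar reliability coefficients.
print(np.corrcoef(item_difficulty(version_a), item_difficulty(version_b))[0, 1])
print(np.corrcoef(item_discrimination(version_a), item_discrimination(version_b))[0, 1])
print(cronbach_alpha(version_a), cronbach_alpha(version_b))
```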
License
The authors grant RELIEVE non-exclusive exploitation rights over published works and agree to distribution under the Creative Commons Attribution-NonCommercial 4.0 International License (CC-BY-NC 4.0), which allows third parties to use the published material provided that the authorship of the work and the source of publication are acknowledged and the material is used for non-commercial purposes.
Authors may enter into additional, independent contractual arrangements for the non-exclusive distribution of the version of the work published in this journal (for example, depositing it in an institutional repository or publishing it in a book), provided that the original publication in this journal is clearly stated.
Authors are encouraged to disseminate their work once it has been published, for example through online institutional archives or their own websites, as this can lead to productive exchanges and increase citations of the published work.
Submitting a paper to RELIEVE implies acceptance of these conditions.