Aberrant Patterns Detection Methods
2006, Anales de Psicología
Abstract
Identifying atypical response patterns is very useful for constructing tests and item banks with psychometric properties, as well as for analyzing their validity. This review collects the most relevant and novel person-fit methods that have been developed within each of the
(Article received: 12-4-05; accepted: 24-4-06)