Aberrant Patterns Detection Methods

Rosa Salas

2006, Anales de Psicología

Abstract

The identification of atypical response patterns is very useful for constructing tests and item banks with psychometric properties, as well as for analyzing their validity. This review gathers the most relevant and novel person-fit methods that have been developed within each of the

(Article received: 12-4-05; accepted: 24-4-06)
