Be. Clauser et al., A COMPARISON OF ALTERNATIVE MATCHING STRATEGIES FOR DIF DETECTION IN TESTS THAT ARE MULTIDIMENSIONAL, Journal of educational measurement, 33(2), 1996, pp. 202-214
Most currently accepted approaches for identifying differentially
functioning test items compare performance across groups after first
matching examinees on the ability of interest. The typical basis for
this matching is the total test score. Previous research indicates that
when the test is not approximately unidimensional, matching using the
total test score may result in an inflated Type I error rate. This study
compares the results of differential item functioning (DIF) analysis
with matching based on the total test score, matching based on subtest
scores, or multivariate matching using multiple subtest scores. Analyses
of both actual and simulated data indicate that for the dimensionally
complex test examined in this study, using the total test score as the
matching criterion is inappropriate. The results suggest that matching
on multiple subtest scores simultaneously may be superior to using
either the total test score or individual relevant subtest scores.
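
The effect of the matching criterion can be illustrated with a minimal sketch. The abstract does not say which DIF statistic the study used, so the Mantel-Haenszel common odds ratio here is an assumption chosen for simplicity. The function name `mh_odds_ratio`, the helper `make`, the field names `s1` and `s2`, and all counts are hypothetical and invented for illustration; they are not the study's data. In this toy data set the studied item depends only on the second subtest score, and the two groups differ in how their total scores are composed.

```python
from collections import defaultdict


def mh_odds_ratio(records, key):
    """Mantel-Haenszel common odds ratio for one studied item.

    records: iterable of dicts with 'group' ('ref' or 'focal'),
             'correct' (0 or 1), and the score fields that `key` reads.
    key:     function mapping a record to its matching stratum.
    """
    # Per stratum: [ref correct, ref incorrect, focal correct, focal incorrect]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for r in records:
        offset = 0 if r["group"] == "ref" else 2
        strata[key(r)][offset + (1 - r["correct"])] += 1

    num = den = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den if den else float("nan")


def make(group, s1, s2, n_correct, n_incorrect):
    """Build toy examinee records sharing one group and subtest profile."""
    base = {"group": group, "s1": s1, "s2": s2}
    return ([dict(base, correct=1) for _ in range(n_correct)]
            + [dict(base, correct=0) for _ in range(n_incorrect)])


# Hypothetical data: within each (s1, s2) profile both groups answer the
# item correctly at the same rate, so the item has no DIF by construction.
records = (
    make("ref", 1, 0, 1, 3) + make("ref", 0, 1, 6, 2)
    + make("focal", 1, 0, 2, 6) + make("focal", 0, 1, 3, 1)
)

# Matching on the total score pools the two profiles into one stratum.
by_total = mh_odds_ratio(records, key=lambda r: r["s1"] + r["s2"])
# Multivariate matching keeps each combination of subtest scores separate.
by_subtests = mh_odds_ratio(records, key=lambda r: (r["s1"], r["s2"]))

print(round(by_total, 2), round(by_subtests, 2))
```

Under these made-up counts, matching on the total score yields an odds ratio well away from 1 even though the item was constructed without DIF, while matching on both subtest scores simultaneously returns an odds ratio of 1. This only mirrors the kind of false flagging the abstract describes and says nothing about the magnitude of the effect in the study.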