K. M. Mazor et al., Multidimensional DIF analyses: The effects of matching on unidimensional subtest scores, Applied Psychological Measurement, 22(4), 1998, pp. 357-367
Popular techniques for assessing differential item functioning (DIF)
assume that the test under study is unidimensional. When this assumption
is tenable, number-correct score is a reasonable matching criterion.
When a test is intentionally multidimensional, matching on a single test
score does not ensure comparability and may result in inflated error
rates. An alternate approach is to match on all relevant traits
simultaneously, using a procedure such as logistic regression. In this
study, data were generated to simulate two-dimensional tests. The
dimensional structure of the tests, the discrimination levels of the
items, and the correlation between the traits measured by the test were
varied. Standard DIF analyses were conducted using total test score as
the matching variable. High false-positive error rates were found. Items
were divided into subtests using nonlinear factor analysis, and DIF
analyses were repeated with subtest scores as the matching criteria.
False-positive error rates were reduced for most datasets. The
dimensional structure of the test and the discrimination level of the
items influenced false-positive rates for both sets of DIF analyses. The
findings suggest that assessing the dimensional structure of a test can
be an important first step in DIF analysis. If a dataset is intentionally
multidimensional, conditioning on scores reflecting each dimension can
enhance the validity of the analyses.
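
The contrast between matching on total score and matching on a subtest score can be illustrated with a small simulation. The sketch below is not the study's procedure; it is a minimal illustration under stated assumptions. It uses the Mantel-Haenszel common odds ratio as a stand-in DIF index (the abstract does not name the "standard" analysis, and names logistic regression only as one option), it assigns items to subtests by construction rather than by nonlinear factor analysis, and every numeric setting (n_per_group, rho, shift, the 1.7 scaling, equal item discriminations) is hypothetical. No item is given DIF; the focal group is simply lower on the second trait.

```python
import math
import random
from collections import defaultdict


def mh_odds_ratio(item, group, strata):
    """Mantel-Haenszel common odds ratio for one item.

    item: 0/1 responses; group: 0 = reference, 1 = focal;
    strata: hashable matching key per examinee (e.g. a score).
    """
    # Per stratum: [ref right, ref wrong, focal right, focal wrong]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for u, g, s in zip(item, group, strata):
        tables[s][2 * g + (1 - u)] += 1
    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den if den > 0 else float("nan")


def simulate(n_per_group=2000, n_items=20, rho=0.3, shift=1.0, seed=0):
    """Two-dimensional test with no built-in DIF (all settings hypothetical).

    The first half of the items depends on trait 1, the second half on
    trait 2. The focal group (g = 1) is lower by `shift` on trait 2 only.
    """
    rng = random.Random(seed)
    data, group = [], []
    for g in (0, 1):
        for _ in range(n_per_group):
            z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
            t1 = z1
            t2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2 - shift * g
            row = []
            for j in range(n_items):
                theta = t1 if j < n_items // 2 else t2
                p = 1.0 / (1.0 + math.exp(-1.7 * theta))  # equal discrimination
                row.append(1 if rng.random() < p else 0)
            data.append(row)
            group.append(g)
    return data, group


data, group = simulate()
half = len(data[0]) // 2
total = [sum(r) for r in data]          # single total-score matching variable
sub1 = [sum(r[:half]) for r in data]    # subtest score for trait 1 only
item0 = [r[0] for r in data]            # a trait-1 item with no true DIF

or_total = mh_odds_ratio(item0, group, total)
or_sub = mh_odds_ratio(item0, group, sub1)
print("MH odds ratio, matched on total score:  ", round(or_total, 3))
print("MH odds ratio, matched on subtest score:", round(or_sub, 3))
```

An odds ratio near 1 indicates no apparent DIF. Under these assumptions, examinees with equal total scores are not equal on the first trait across groups, so the total-score analysis tends to drift away from 1 for a DIF-free item, while the subtest-matched analysis should stay closer to 1. To condition on both dimensions at once, the matching key could instead be a tuple of the two subtest scores.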