As computerized adaptive testing becomes increasingly popular, techniques are being developed to test for differential item functioning (DIF) when not all examinees take the same items, or even the same number of items. Roussos (1996) developed a program known as CATSIB that identifies items exhibiting DIF using an examinee's ability estimate, rather than total test score, as the conditioning variable. CATSIB employs what is commonly referred to as a regression correction to control for the estimation bias and inflated Type I error rates that occur when the reference and focal groups differ in their observed score distributions.
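For orientation, the weighted-difference statistic underlying SIBTEST and CATSIB can be sketched as follows; the notation is the common Shealy-Stout formulation, supplied here as background rather than reproduced from this article:

\[
\hat{\beta} \;=\; \sum_{k=0}^{K} \hat{p}_k \left( \bar{Y}^{*}_{Rk} - \bar{Y}^{*}_{Fk} \right),
\qquad
z \;=\; \frac{\hat{\beta}}{\hat{\sigma}\!\left(\hat{\beta}\right)},
\]

where \(\hat{p}_k\) is the proportion of focal-group examinees in matching stratum \(k\), and \(\bar{Y}^{*}_{Rk}\) and \(\bar{Y}^{*}_{Fk}\) are the regression-corrected mean scores on the studied item for reference- and focal-group examinees in that stratum. SIBTEST forms the strata from total test scores; CATSIB forms them from intervals of the ability estimate \(\hat{\theta}\). The regression correction replaces raw stratum means with means predicted from the within-group regression of true score on the matching variable, which is what counteracts the bias described above.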
Simulation studies were conducted that compared power and Type I error rates under 2 conditions: using an examinee's ability estimate as the conditioning variable (CATSIB) either with or without the regression correction. In addition, power and Type I error rates were examined when total test score was used as the conditioning variable (SIBTEST).
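As background on how such rates are typically estimated, the sketch below shows the bookkeeping of a DIF simulation study. The function dif_p_value is a hypothetical stand-in for a CATSIB- or SIBTEST-style test, and the alpha level and replication count are illustrative, not taken from this study.

```python
import numpy as np

rng = np.random.default_rng(0)
ALPHA = 0.05   # nominal significance level (illustrative)
N_REPS = 500   # number of simulated datasets (illustrative)

def dif_p_value(dataset, item):
    """Hypothetical stand-in for a CATSIB/SIBTEST-style DIF test.

    A real implementation would stratify examinees on the matching
    variable and return the p-value of the weighted mean difference;
    here a placeholder keeps the sketch runnable.
    """
    return rng.uniform()

reject_dif = 0    # rejections on items seeded with DIF -> power
reject_clean = 0  # rejections on DIF-free items -> Type I error
for _ in range(N_REPS):
    dataset = None  # one simulated set of item responses would go here
    reject_dif += dif_p_value(dataset, item="seeded") < ALPHA
    reject_clean += dif_p_value(dataset, item="clean") < ALPHA

print("empirical power      :", reject_dif / N_REPS)
print("empirical Type I rate:", reject_clean / N_REPS)
```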
Type I error rates were examined for DIF-free items both when other items within the test displayed DIF (local Type I error) and when no DIF was present in the data (global Type I error). For each of these cases, 3 types of DIF were explored: uniform, ordinal, and disordinal.
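To make the three DIF types concrete, the following sketch contrasts 2PL item response functions whose parameters differ between the reference and focal groups; the parameter values are illustrative choices, not the ones used in the study. Under uniform DIF the curves are shifted by a difficulty offset alone, so one group is disadvantaged at every ability level; under ordinal (non-crossing) DIF the gap changes magnitude with ability but not direction; under disordinal (crossing) DIF the curves cross within the ability range, so the direction reverses.

```python
import numpy as np

def p_2pl(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)  # ability grid spanning the usual range

# (a, b) pairs for (reference, focal); values are illustrative only.
items = {
    # uniform: same discrimination, shifted difficulty; the focal
    # group is disadvantaged at every ability level
    "uniform":    ((1.0, 0.0), (1.0, 0.5)),
    # ordinal: the gap varies in size with ability, but the curves
    # cross only outside [-3, 3], so its direction never flips here
    "ordinal":    ((1.0, 0.0), (1.3, 0.9)),
    # disordinal: the curves cross near theta = 0.8, so the
    # advantaged group reverses across the ability range
    "disordinal": ((1.0, 0.0), (1.6, 0.3)),
}

for kind, ((a_r, b_r), (a_f, b_f)) in items.items():
    gap = p_2pl(theta, a_r, b_r) - p_2pl(theta, a_f, b_f)
    print(f"{kind:10s} P_ref - P_foc:", np.round(gap, 3))
```

Printing the sign pattern of the gap across the ability grid makes the taxonomy visible: all-positive and roughly symmetric for uniform DIF, all-positive but shrinking for ordinal DIF, and a sign change for disordinal DIF.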