V.S.L. Williams et al., A COMPARISON OF DEVELOPMENTAL SCALES BASED ON THURSTONE METHODS AND ITEM RESPONSE THEORY, Journal of Educational Measurement, 35(2), 1998, pp. 93-107
A developmental scale for the North Carolina End-of-Grade Mathematics
Tests was created using a subset of identical test forms administered
to adjacent grade levels. Thurstone scaling and item response theory
(IRT) techniques were employed to analyze the changes in grade
distributions across these linked forms. Three variations of Thurstone
scaling were examined, one based on Thurstone's 1925 procedure and two
based on Thurstone's 1938 procedure. The IRT scaling was implemented
using both BIMAIN and MULTILOG. All methods indicated that average
mathematics performance improved from Grade 3 to Grade 8, with similar
results for the two IRT analyses and one version of Thurstone's 1938
method. The standard deviations of the IRT scales did not show a
consistent pattern across grades, whereas those produced by
Thurstone's 1925 procedure generally decreased; one version of the
1938 method exhibited slightly increasing variation with increasing
grade level, while the other version displayed inconsistent trends.