New goodness-of-fit indices are introduced for dichotomous item response
theory (IRT) models. These indices are based on the likelihoods of
number-correct scores derived from the IRT model, and they provide a direct
comparison of the modeled and observed frequencies for correct and incorrect
responses for each number-correct score. The behavior of Pearson's χ² (S-X²)
and the likelihood ratio G² (S-G²) was assessed in a simulation study and
compared with two fit indices similar to those currently in use (Q₁-X² and
Q₁-G²). The simulations included three conditions in which the simulating
and fitting models were identical and three conditions involving model
misspecification. S-X² performed well, with Type I error rates close to the
expected .05 and .01 levels. Performance of this index improved with
increased test length. S-G² tended to reject the null hypothesis too often,
as did Q₁-X² and Q₁-G². The power of S-X² appeared to be similar for all
test lengths, but varied depending on the type of model misspecification.
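
To make the comparison of modeled and observed frequencies at each number-correct score concrete, the following is a minimal sketch of how a Pearson-type statistic of this kind might be computed for one item. It is an illustration under stated assumptions, not the authors' implementation. It assumes a two-parameter logistic model, a standard normal ability distribution approximated on an evenly spaced quadrature grid, and a recursive computation of the score likelihoods. It skips any collapsing of sparse cells and does not compute degrees of freedom. The function names (`score_likelihoods`, `expected_correct`, `s_x2`) and the item parameters in the usage example are hypothetical.

```python
import numpy as np

def score_likelihoods(p):
    """Likelihood of each number-correct score at each quadrature point.

    p: (n_items, n_quad) array of correct-response probabilities.
    Returns an (n_items + 1, n_quad) array; row k is the score-k likelihood.
    """
    n_items, n_quad = p.shape
    s = np.zeros((n_items + 1, n_quad))
    s[0] = 1.0  # with no items, score 0 is certain
    for pj in p:  # add one item at a time
        nxt = s * (1.0 - pj)     # item answered incorrectly: score unchanged
        nxt[1:] += s[:-1] * pj   # item answered correctly: score moves up by 1
        s = nxt
    return s

def expected_correct(p, weights, item):
    """Modeled proportion correct on `item` for scores 1 .. n_items - 1."""
    n_items = p.shape[0]
    full = score_likelihoods(p)                            # all items
    rest = score_likelihoods(np.delete(p, item, axis=0))   # item left out
    k = np.arange(1, n_items)                              # interior scores
    numerator = (p[item] * rest[k - 1]) @ weights
    denominator = full[k] @ weights
    return numerator / denominator

def s_x2(responses, p, weights, item):
    """Pearson-type statistic comparing observed and modeled proportions.

    responses: (n_persons, n_items) array of 0/1 item scores.
    Assumption: score groups are not collapsed; empty groups are skipped.
    """
    n_items = p.shape[0]
    scores = responses.sum(axis=1)
    expected = expected_correct(p, weights, item)
    stat = 0.0
    for idx, k in enumerate(range(1, n_items)):
        group = responses[scores == k, item]
        if group.size == 0:
            continue
        observed = group.mean()
        e = expected[idx]
        stat += group.size * (observed - e) ** 2 / (e * (1.0 - e))
    return stat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 2PL item parameters, chosen only for illustration.
    a = np.array([1.0, 1.2, 0.8, 1.5, 1.0])
    b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
    grid = np.linspace(-4.0, 4.0, 41)
    weights = np.exp(-0.5 * grid**2)
    weights /= weights.sum()  # normalized standard normal weights
    p_grid = 1.0 / (1.0 + np.exp(-a[:, None] * (grid[None, :] - b[:, None])))
    # Simulate responses from the same model that is used for fitting.
    theta = rng.standard_normal(1000)
    p_person = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
    responses = (rng.random(p_person.shape) < p_person).astype(int)
    print(s_x2(responses, p_grid, weights, item=0))
```

In this sketch the usage example simulates and scores data with the same model, which corresponds to the conditions in which the simulating and fitting models were identical. The likelihood ratio form would replace the squared-difference term with the corresponding log-ratio term over the same score groups.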