Standard item response theory (IRT) models fit to dichotomous examination responses ignore the fact that sets of items (testlets) often come from a single common stimulus (e.g., a reading comprehension passage). In this setting, the items given to an examinee are unlikely to be conditionally independent (given examinee proficiency). Models that assume conditional independence will overestimate the precision with which examinee proficiency is measured. Overstatement of precision may lead to inaccurate inferences, such as prematurely ending an examination in which the stopping rule is based on the estimated standard error of examinee proficiency (e.g., an adaptive test). To model examinations that may be a mixture of independent items and testlets, we modified one standard IRT model to include an additional random effect for items nested within the same testlet.
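A minimal sketch of the kind of modification described, assuming a two-parameter probit formulation; the symbols below ($\theta_i$ for proficiency, $a_j$ and $b_j$ for item discrimination and difficulty, $\gamma_{i\,d(j)}$ for the effect of person $i$ on testlet $d(j)$) are illustrative notation, not necessarily the paper's exact parameterization:

```latex
% Latent propensity of a correct response for person i on item j,
% with d(j) the testlet containing item j (illustrative sketch):
t_{ij} = a_j\bigl(\theta_i - b_j - \gamma_{i\,d(j)}\bigr) + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0,1),
\qquad y_{ij} = \mathbf{1}\{t_{ij} > 0\},
% with the additional testlet random effect
\gamma_{i\,d(j)} \sim N(0, \sigma_\gamma^2).
```

For independent (non-testlet) items $\gamma_{i\,d(j)}$ is fixed at zero, so $\sigma_\gamma^2 = 0$ recovers the standard conditionally independent model, while larger $\sigma_\gamma^2$ induces stronger within-testlet dependence.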
We use a Bayesian framework to facilitate posterior inference via a Data Augmented Gibbs Sampler (DAGS; Tanner & Wong, 1987).
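To make the data-augmentation idea concrete, here is a hedged sketch of an Albert (1992)-style DAGS for a simplified one-parameter probit version of the model above (discriminations fixed at 1, variance components held fixed rather than sampled, all items placed in testlets). The synthetic data and all names are illustrative, not the paper's implementation:

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)

# --- Synthetic data under a simplified probit testlet model (illustrative) ---
I, J, T = 200, 20, 5                 # examinees, items, testlets
d = np.repeat(np.arange(T), J // T)  # d[j]: testlet index of item j
theta_true = rng.normal(0, 1, I)
b_true = rng.normal(0, 1, J)
sigma_g = 0.8                        # testlet-effect SD (fixed in this sketch)
gamma_true = rng.normal(0, sigma_g, (I, T))
eta = theta_true[:, None] - b_true[None, :] - gamma_true[:, d]
y = (eta + rng.normal(0, 1, (I, J)) > 0).astype(int)

# --- Data Augmented Gibbs Sampler ---
theta, b, gamma = np.zeros(I), np.zeros(J), np.zeros((I, T))
sigma_b = 2.0                        # prior SD for difficulties

for it in range(1000):
    # 1. Augmentation step: draw latent propensities Z from truncated normals,
    #    truncated below at 0 when y = 1 and above at 0 when y = 0.
    mu = theta[:, None] - b[None, :] - gamma[:, d]
    lo = np.where(y == 1, -mu, -np.inf)   # standardized bounds on Z - mu
    hi = np.where(y == 1, np.inf, -mu)
    Z = mu + truncnorm.rvs(lo, hi, size=(I, J), random_state=rng)

    # 2. theta_i | rest: Z_ij + b_j + gamma ~ N(theta_i, 1), prior N(0, 1),
    #    so the full conditional is a conjugate normal.
    W = Z + b[None, :] + gamma[:, d]
    v = 1.0 / (J + 1.0)
    theta = rng.normal(v * W.sum(axis=1), np.sqrt(v))

    # 3. b_j | rest: theta_i - gamma - Z_ij ~ N(b_j, 1), prior N(0, sigma_b^2).
    W = theta[:, None] - gamma[:, d] - Z
    v = 1.0 / (I + 1.0 / sigma_b**2)
    b = rng.normal(v * W.sum(axis=0), np.sqrt(v))

    # 4. gamma_{i,t} | rest: pooled over the items in testlet t,
    #    prior N(0, sigma_g^2).
    W = theta[:, None] - b[None, :] - Z
    for t in range(T):
        cols = np.where(d == t)[0]
        v = 1.0 / (len(cols) + 1.0 / sigma_g**2)
        gamma[:, t] = rng.normal(v * W[:, cols].sum(axis=1), np.sqrt(v))

print("corr(theta_draw, theta_true):", np.corrcoef(theta, theta_true)[0, 1])
```

Because every full conditional is either a truncated normal or a conjugate normal, the augmented sampler needs no tuning; a production version would also sample $\sigma_\gamma^2$ and the discriminations, and discard burn-in draws.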
The modified and standard IRT models are both applied to a data set from a disclosed form of the SAT. We also provide simulation results indicating that the degree of precision bias is a function of the variability of the testlet effects, as well as of the testlet design.