Estimators of conditional scale-score standard errors of measurement: A simulation study

Authors

Lee, WC; Brennan, RL; Kolen, MJ

Citation

W.C. Lee et al., Estimators of conditional scale-score standard errors of measurement: A simulation study, J EDUC MEAS, 37(1), 2000, pp. 1-20

Citations number

Subject categories

Psychology

Journal title

JOURNAL OF EDUCATIONAL MEASUREMENT

ISSN journal

0022-0655

Volume

37

Issue

1

Year of publication

2000

Pages

1 - 20

Database

ISI

SICI code

0022-0655(200021)37:1<1:EOCSSE>2.0.ZU;2-D

Abstract

This paper describes four procedures previously developed for estimating conditional standard errors of measurement for scale scores: the IRT procedure (Kolen, Zeng, & Hanson, 1996), the binomial procedure (Brennan & Lee, 1999), the compound binomial procedure (Brennan & Lee, 1999), and the Feldt-Qualls procedure (1998). These four procedures are based on different underlying assumptions. The IRT procedure is based on the unidimensional IRT model assumptions. The binomial and compound binomial procedures employ, as the distribution of errors, the binomial model and compound binomial model, respectively. By contrast, the Feldt-Qualls procedure does not depend on a particular psychometric model, and it simply translates any estimated conditional raw-score SEM to a conditional scale-score SEM. These procedures are compared in a simulation study, which involves two-dimensional data sets. The presence of two category dimensions reflects a violation of the IRT unidimensionality assumption. The relative accuracy of these procedures for estimating conditional scale-score standard errors of measurement is evaluated under various circumstances. The effects of three different types of transformations of raw scores are investigated, including developmental standard scores, grade equivalents, and percentile ranks. All the procedures discussed appear viable. A general recommendation is made that test users select a procedure based on various factors such as the type of scale score of concern, characteristics of the test, assumptions involved in the estimation procedure, and feasibility and practicability of the estimation procedure.