This paper describes four previously developed procedures for estimating conditional standard errors of measurement (SEMs) for scale scores: the IRT procedure (Kolen, Zeng, & Hanson, 1996), the binomial procedure (Brennan & Lee, 1999), the compound binomial procedure (Brennan & Lee, 1999), and the Feldt-Qualls procedure (1998). The four procedures rest on different underlying assumptions. The IRT procedure is based on the unidimensional IRT model assumptions. The binomial and compound binomial procedures employ, as the distribution of errors, the binomial model and the compound binomial model, respectively. By contrast, the Feldt-Qualls procedure does not depend on a particular psychometric model; it simply translates any estimated conditional raw-score SEM into a conditional scale-score SEM. The procedures are compared in a simulation study that uses two-dimensional data sets. The presence of two category dimensions reflects a violation of the IRT unidimensionality assumption. The relative accuracy of the procedures for estimating conditional scale-score SEMs is evaluated under various circumstances. The effects of three types of raw-score transformations are investigated: developmental standard scores, grade equivalents, and percentile ranks. All of the procedures discussed appear viable. As a general recommendation, test users should select a procedure based on factors such as the type of scale score of concern, the characteristics of the test, the assumptions involved in the estimation procedure, and the feasibility and practicability of that procedure.
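
To make the idea behind the binomial procedure concrete, the following is a minimal sketch and not the authors' implementation. It assumes that, for a given true proportion-correct score, the raw score on an n-item test follows a binomial distribution, and that each raw score maps to a scale score through a conversion table. The function name `binomial_scale_csem` and the conversion table shown are hypothetical and chosen only for illustration.

```python
from math import comb, sqrt


def binomial_scale_csem(pi, conversion):
    """Conditional scale-score SEM at true proportion-correct `pi`.

    `conversion[x]` is the scale score assigned to raw score x (x = 0..n).
    Assumption: raw scores given `pi` are Binomial(n, pi).
    """
    n = len(conversion) - 1
    # Binomial probability of each possible raw score 0..n.
    probs = [comb(n, x) * pi**x * (1 - pi) ** (n - x) for x in range(n + 1)]
    # Conditional mean and variance of the scale score under those probabilities.
    mean = sum(p * s for p, s in zip(probs, conversion))
    var = sum(p * (s - mean) ** 2 for p, s in zip(probs, conversion))
    return sqrt(var)


# Hypothetical 4-item test with a nonlinear raw-to-scale conversion table.
scale_table = [10, 12, 15, 19, 20]
csem_mid = binomial_scale_csem(0.5, scale_table)

# With an identity conversion the result reduces to the raw-score
# binomial SEM, sqrt(n * pi * (1 - pi)).
csem_raw = binomial_scale_csem(0.5, [0, 1, 2, 3, 4])
```

With an identity conversion the function returns the familiar raw-score binomial SEM, which is a convenient sanity check. A nonlinear table such as a grade-equivalent or percentile-rank conversion changes how the conditional SEM varies across the score range.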