In this article, we address the problem of improving the measurement
quality of a complex performance assessment through principled assessment
design. We describe the characteristics and measurement impact of steps
taken to improve assessment exercise design along with modifications in
assessor training materials and procedures between the 1995-1996 and the
1996-1997 administrations of the National Board for Professional Teaching
Standards Early Childhood/Generalist examination. Specifically, we describe
how the revision of this assessment contributed to increases in the
interassessor agreement, internal consistency, and generalizability of
scores. All indexes we examined improved after the revisions. The results
suggest that previously observed limits on the measurement quality of
performance assessments due to the relatively small number of items that
contribute to an assessment score may be altered significantly through
attention to assessment design and related scoring processes.