Using generalizability theory as a guide, this study discusses statistical
problems and strategies in analyzing longitudinal rating data involving
multiple raters, a type of data frequently encountered in social work
evaluations. To disentangle raters' bias from clients' true change, the
study shows the importance of examining the multifaceted structure of
measurement error. To analyze data containing nonnegligible variability
associated with raters, this study proposes using a three-level hierarchical
linear model. It demonstrates that the three-level model produces a better
fit to the data, smaller sample residuals, and more accurate significance
tests than the popular two-level model when analyzing rating data with
nonnegligible rater influences.