J.W. Fleenor et al., Interrater Reliability and Agreement of Performance Ratings: A Methodological Comparison, Journal of Business and Psychology, 10(3), 1996, pp. 367-380
This paper demonstrates and compares methods for estimating the interrater reliability and interrater agreement of performance ratings. These methods can be used by applied researchers to investigate the quality of ratings gathered, for example, as criteria for a validity study, or as performance measures for selection or promotional purposes. While estimates of interrater reliability are frequently used for these purposes, indices of interrater agreement appear to be rarely reported for performance ratings. A recommended index of interrater agreement, the T index (Tinsley & Weiss, 1975), is compared to four methods of estimating interrater reliability (Pearson r, coefficient alpha, mean correlation between raters, and intraclass correlation). Subordinate and superior ratings of the performance of 100 managers were used in these analyses. The results indicated that, in general, interrater agreement and reliability among subordinates were fairly high. Interrater agreement between subordinates and superiors was moderately high; however, interrater reliability between these two rating sources was very low. The results demonstrate that interrater agreement and reliability are distinct indices and that both should be reported. Reasons are discussed as to why interrater reliability should not be reported alone.
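As a minimal illustration of three of the reliability indices named in the abstract (mean inter-rater correlation, coefficient alpha with raters treated as "items", and a one-way intraclass correlation, ICC(1)), the sketch below computes them on synthetic ratings for 100 ratees and 4 raters. The data-generating parameters (signal and noise variances) are assumptions for the example, not values from the study, and the paper's T index of agreement is not reproduced here.

```python
import numpy as np

def mean_interrater_r(R):
    """Mean of all pairwise Pearson correlations between rater columns."""
    k = R.shape[1]
    C = np.corrcoef(R, rowvar=False)
    iu = np.triu_indices(k, 1)  # upper triangle = distinct rater pairs
    return C[iu].mean()

def cronbach_alpha(R):
    """Coefficient alpha, treating the k raters as 'items'."""
    k = R.shape[1]
    item_var = R.var(axis=0, ddof=1).sum()      # sum of per-rater variances
    total_var = R.sum(axis=1).var(ddof=1)       # variance of ratee totals
    return k / (k - 1) * (1 - item_var / total_var)

def icc1(R):
    """ICC(1): one-way random-effects intraclass correlation via ANOVA."""
    n, k = R.shape
    grand = R.mean()
    msb = k * ((R.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # between-ratee MS
    msw = ((R - R.mean(axis=1, keepdims=True)) ** 2).sum() / (n * (k - 1))  # within
    return (msb - msw) / (msb + (k - 1) * msw)

# Synthetic ratings: a shared true score per ratee plus rater-specific noise
# (variance ratio chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
true_score = rng.normal(size=(100, 1))              # 100 ratees
R = true_score + 0.5 * rng.normal(size=(100, 4))    # 4 raters

print(f"mean r = {mean_interrater_r(R):.3f}")
print(f"alpha  = {cronbach_alpha(R):.3f}")
print(f"ICC(1) = {icc1(R):.3f}")
```

Note that all three are reliability (consistency) indices: they stay high as long as raters rank-order ratees similarly, even if one rater is uniformly more lenient. That is precisely why the paper argues an agreement index such as the T index should be reported alongside them.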