Even in the era of the objective structured clinical examination (OSCE), the predominant method of resident evaluation is the faculty ward evaluation (WE), despite many concerns about its reliability. The aim of this study was to determine the value of the WE as a measure of clinical competence in terms of both reliability and validity. Over a one-year period, surgery faculty members evaluated 72 residents; an average of 7 faculty members evaluated each resident. The evaluation form contained 10 specific performance ratings and an overall evaluation. Inter-rater reliability of the overall performance rating was calculated using the intraclass correlation.
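The abstract does not state which intraclass correlation (ICC) model was applied; the following is a minimal sketch, assuming a one-way random-effects model, which suits a design where each resident is rated by a different set of faculty. The function name and all data are illustrative, not taken from the study.

```python
import numpy as np

def icc_oneway(ratings_by_resident):
    """One-way random-effects ICC for unbalanced data: each resident
    is rated by a (possibly different) set of faculty raters.
    Returns (single-rater ICC, mean-rating ICC)."""
    groups = [np.asarray(g, dtype=float) for g in ratings_by_resident]
    n = len(groups)                          # number of residents
    N = sum(len(g) for g in groups)          # total number of ratings
    grand_mean = np.concatenate(groups).mean()

    # Between-resident and within-resident mean squares (one-way ANOVA)
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (N - n)

    # Effective number of raters per resident for an unbalanced design
    k = (N - sum(len(g) ** 2 for g in groups) / N) / (n - 1)

    icc_single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc_mean = (ms_between - ms_within) / ms_between
    return icc_single, icc_mean

# Illustrative data: three residents, 6-7 faculty ratings each (1-5 scale)
single, averaged = icc_oneway([
    [4, 5, 4, 5, 5, 4, 4],
    [2, 3, 3, 2, 3, 3],
    [5, 5, 4, 5, 5, 5, 4],
])
print(f"ICC(single rater) = {single:.2f}, ICC(mean rating) = {averaged:.2f}")
```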
Inter-rater reliability of the overall performance rating was 0.82; the reliability of a single overall rating was 0.39.
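These two values are mutually consistent under the Spearman-Brown relation between single-rater and k-rater reliability, assuming the 0.82 figure refers to the rating averaged over the roughly 7 raters per resident (the abstract does not state the formula used):

\[
\rho_k = \frac{k\,\rho_1}{1 + (k-1)\,\rho_1}, \qquad
\rho_7 = \frac{7 \times 0.39}{1 + 6 \times 0.39} \approx 0.82 .
\]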
Validity of the WE was evaluated in four ways. (1) A discriminant function analysis indicated that residents at advanced levels of training received more positive evaluations than residents at less advanced levels (P < 0.0001). (2) The overall rating was significantly correlated (r = 0.55, P < 0.0001) with the overall score of a concurrent OSCE. (3) A factor analysis showed high correlations among the items, indicating a lack of discrimination between the skills. (4) Overall ratings were insensitive to performance deficiencies: only 1.3% of the ratings were unsatisfactory or marginal. The WE was sufficiently reliable to estimate the faculty's view of each resident. The fact that the ratings tended to differentiate residents by level of training, and that the ratings correlated significantly with the OSCE, provides strong evidence of their validity. However, the factor analysis indicated that faculty members were making one global, undifferentiated judgment and that these ratings did not identify deficient performance skills. We conclude that ward evaluations have a place in the assessment of residents. (C) 1997 Academic Press.