This study estimated the interrater reliability of medical student evaluations of clinical teaching. Data consisted of 1,570 ratings evaluating 147 faculty over a 4-year period in a 3rd-year internal medicine clerkship. The number of ratings a typical faculty member receives in a year was also calculated and used to extrapolate the standard error of measurement for data typically available to evaluate faculty at different time intervals. The data available to evaluate a faculty member after 1 year was not adequate, but improved substantially at the 5- to 7-year mark, when a faculty member is typically evaluated for promotion and tenure.