Consistency, inter-rater reliability, and validity of 441 consecutive mock oral examinations in anesthesiology: Implications for use as a tool for assessment of residents
A. Schubert et al., Consistency, inter-rater reliability, and validity of 441 consecutive mock oral examinations in anesthesiology: Implications for use as a tool for assessment of residents, Anesthesiology, 91(1), 1999, pp. 288-298
Citations: 41
Subject categories
Anesthesia & Intensive Care; Medical Research Diagnosis & Treatment
Background: Oral practice examinations (OPEs) are used extensively in many anesthesiology programs for various reasons, including assessment of clinical judgment. Yet oral examinations have been criticized for their subjectivity. The authors studied the reliability, consistency, and validity of their OPE program to determine whether it was a useful assessment tool.
Methods: From 1989 through 1993, we prospectively studied 441 OPEs given to 190 residents. The examination format closely approximated that used by the American Board of Anesthesiology. Pass-fail grade and an overall numerical score were the OPE results of interest. Internal consistency and interrater reliability were determined using agreement measures. To assess their validity in describing competence, OPE results were correlated with in-training examination results and faculty evaluations. Furthermore, we analyzed the relationship of OPE results with implicit indicators of resident preparation, such as length of training.
Results: The internal consistency coefficient for the overall numerical score was 0.82, indicating good correlation among component scores. The interexaminer agreement was 0.68, indicating moderate to good agreement beyond that expected by chance. The actual agreement among examiners on pass-fail was 84%. Correlation of the overall numerical score with in-training examination scores and faculty evaluations was moderate (r = 0.47 and 0.41, respectively; P < 0.01). OPE results were significantly (P < 0.01) associated with training duration, previous OPE experience, trainee preparedness, and trainee anxiety.
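The two agreement figures above measure different things: raw percent agreement (84%) counts all matching pass-fail grades, while a chance-corrected coefficient such as Cohen's kappa (the abstract does not name the exact statistic used, so kappa here is an assumption) discounts the agreement two examiners would reach by guessing from their own pass-fail base rates. A minimal sketch with hypothetical grades:

```python
# Illustrative only (not the authors' code or data): raw percent
# agreement vs. Cohen's kappa for two examiners' pass/fail grades.
from collections import Counter

def percent_agreement(a, b):
    """Fraction of cases where both examiners gave the same grade."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(a)
    p_obs = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    # Chance agreement expected from each rater's marginal frequencies
    p_exp = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical pass/fail grades from two examiners on eight candidates
ex1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail"]
ex2 = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "pass"]

print(round(percent_agreement(ex1, ex2), 2))  # 0.75
print(round(cohens_kappa(ex1, ex2), 2))       # 0.47
```

Note that kappa is always lower than raw agreement whenever chance agreement is nonzero, which is why the study can report 84% raw agreement alongside a coefficient of 0.68.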
Conclusion: Our results show the substantial internal consistency and reliability of OPE results at a single institution. The positive correlation of OPE scores with in-training examination scores, faculty evaluations, and other indicators of preparation suggests that OPEs are a reasonably valid tool for assessment of resident performance.