G. Bullock et al., Evaluating procedural skills competence: Inter-rater reliability of expert and non-expert observers, ACAD MED, 74(1), 1999, pp. 76-78
Purpose. To examine the inter-rater reliability of expert and non-expert observers when they used objective structured checklists to evaluate candidates' performances on three simulated medical procedures.
Method. Simulations and structured checklists were developed for three medical procedures: endotracheal intubation, application of a forearm cast, and suturing of a simple skin laceration. Groups comprising two expert and two non-expert observers scored 101 performances of the procedures by 38 medical trainees and practitioners of varying skill levels. Inter-rater reliability was assessed using Pearson correlation coefficients.
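As an illustration of the reliability measure used, the sketch below computes a Pearson correlation coefficient between two observers' checklist totals. The rater labels and score values are hypothetical, not data from the study.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two raters' scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical checklist totals from one expert and one non-expert
# observer scoring the same five candidates
expert = [14, 9, 17, 12, 15]
non_expert = [13, 10, 16, 12, 14]
print(round(pearson_r(expert, non_expert), 3))
```

A coefficient near 1.0 indicates that the two observers rank and score candidates consistently, which is the sense in which the study reports "good" inter-rater reliability.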
Results. Inter-rater reliability was good for expert/expert, expert/non-expert, and non-expert/non-expert pairings in all three skills simulations.
Conclusion. Both expert and non-expert observers demonstrated good inter-rater reliability when using structured checklists to assess procedural skills. Further study is required to determine whether this conclusion may be extrapolated to other study groups or procedures.