Standard setting in an objective structured clinical examination: use of global ratings of borderline performance to determine the passing score

Citation
T. Wilkinson et al., Standard setting in an objective structured clinical examination: use of global ratings of borderline performance to determine the passing score, MED EDUC, 35(11), 2001, pp. 1043-1049
Number of citations
6
Subject categories
Health Care Sciences & Services
Journal title
MEDICAL EDUCATION
ISSN journal
0308-0110
Volume
35
Issue
11
Year of publication
2001
Pages
1043-1049
Database
ISI
SICI code
0308-0110(200111)35:11<1043:SSIAOS>2.0.ZU;2-9
Abstract
Background
Objective structured clinical examination (OSCE) standard-setting procedures are not well developed and are often time-consuming and complex. We report an evaluation of a simple 'contrasting groups' method, applied to an OSCE conducted simultaneously in three separate schools.
Subjects
Medical students undertaking an end-of-fifth-year multidisciplinary OSCE.
Methods
Using structured marking sheets, pairs of examiners independently scored student performance at each OSCE station. Examiners also provided a global rating of overall performance. The actual scores of any borderline candidates at each station were averaged to provide a passing score for each station. The passing scores for all stations were combined to become the passing score for the whole exam. Validity was determined by making comparisons with performance on other fifth-year assessments. Reliability measures comprised interschool agreement, interexaminer agreement and interstation variability.
Results
The approach was simple and had face validity. There was a stronger association between the performance of borderline candidates on the OSCE and their in-course assessments than with their performance on the written exam, giving a weak measure of construct validity in the absence of a better 'gold standard'. There was good agreement between examiners in identifying borderline candidates. There were significant differences between schools in the borderline score for some stations, which disappeared when more than three stations were aggregated.
Conclusion
This practical method provided a valid and reliable competence-based pass mark. Combining marks from all stations before determining the pass mark was more reliable than making decisions based on individual stations.
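
Illustration
The pass-mark arithmetic described under Methods (average the checklist scores of candidates rated borderline at each station, then combine the station pass marks across the whole exam) can be sketched as follows. This is a minimal illustration, not the authors' code; the station names, ratings and scores are hypothetical.

# Minimal sketch of the borderline-group standard-setting arithmetic
# described in Methods. Station names, ratings and scores below are
# hypothetical, for illustration only.

from statistics import mean

# Per-station records: (checklist score, examiner global rating).
# A candidate counts as "borderline" when the global rating says so.
station_results = {
    "history_taking": [(62.0, "pass"), (48.0, "borderline"), (55.0, "borderline")],
    "cardiac_exam":   [(71.0, "pass"), (52.0, "borderline"), (39.0, "fail")],
    "counselling":    [(58.0, "borderline"), (66.0, "pass"), (49.0, "borderline")],
}

# Station pass mark: the mean checklist score of the candidates
# rated borderline at that station.
station_pass_marks = {
    station: mean(score for score, rating in results if rating == "borderline")
    for station, results in station_results.items()
}

# Whole-exam pass mark: station pass marks combined (summed here;
# averaging is equivalent up to scaling). The paper reports that this
# aggregated mark is more reliable than pass/fail decisions made at
# individual stations.
exam_pass_mark = sum(station_pass_marks.values())

print(station_pass_marks)
print(f"Exam pass mark: {exam_pass_mark:.1f}")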