Composite undergraduate clinical examinations: how should the components be combined to maximize reliability?

Citation
V. Wass et al., Composite undergraduate clinical examinations: how should the components be combined to maximize reliability, MED EDUC, 35(4), 2001, pp. 326-330
Number of citations
10
Subject categories
Health Care Sciences & Services
Journal title
MEDICAL EDUCATION
Journal ISSN
0308-0110
Volume
35
Issue
4
Year of publication
2001
Pages
326 - 330
Database
ISI
SICI code
0308-0110(200104)35:4<326:CUCEHS>2.0.ZU;2-H
Abstract
Background: Clinical examinations increasingly consist of composite tests to assess all aspects of the curriculum recommended by the General Medical Council.
Setting: A final undergraduate medical school examination for 214 students.
Aim: To estimate the overall reliability of a composite examination, the correlations between the tests, and the effect of differences in test length, number of items and weighting of the results on the reliability.
Method: The examination consisted of four written and two clinical tests: a multiple-choice questions (MCQ) test, extended matching questions (EMQ), short-answer questions (SAQ), essays, an objective structured clinical examination (OSCE) and history-taking long cases. Multivariate generalizability theory was used to estimate the composite reliability of the examination and the effects of item weighting and test length.
Results: The composite reliability of the examination was 0.77 if all tests contributed equally. Correlations between examination components varied, suggesting that different theoretically interpretable parameters of competence were being tested. Weighting tests according to items per test or total test time gave improved reliabilities of 0.93 and 0.81, respectively. Double weighting of the clinical component marginally affected the reliability (0.76).
Conclusion: This composite final examination achieved an overall reliability sufficient for high-stakes decisions on student clinical competence. However, examination structure must be carefully planned and results combined with caution. Weighting according to number of items or test length significantly affected reliability. The components testing different aspects of knowledge and clinical skills must be carefully balanced to ensure both content validity and parity between items and test length.
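Illustrative note: the abstract does not give the formulas or the variance components used. As a minimal sketch of how component weights can change a composite reliability, the Python below uses the standard weighted-composite form: weighted universe-score variance divided by itself plus weighted error variance. It assumes errors are uncorrelated across components. The function name `composite_reliability` and all numbers are hypothetical and are not taken from the study.

```python
def composite_reliability(weights, universe_cov, error_var):
    """Reliability of a weighted composite of test components.

    weights      -- one weight per component (hypothetical values)
    universe_cov -- square matrix of universe-score (true-score)
                    variances and covariances between components
    error_var    -- error variance of each component's mean score;
                    errors are ASSUMED uncorrelated across components
    """
    k = len(weights)
    # Weighted universe-score variance: w' * Sigma_p * w
    true_var = sum(
        weights[i] * weights[j] * universe_cov[i][j]
        for i in range(k)
        for j in range(k)
    )
    # Weighted error variance: sum of w_i^2 * sigma^2_error_i
    err_var = sum(weights[i] ** 2 * error_var[i] for i in range(k))
    return true_var / (true_var + err_var)


# Made-up two-component example (NOT data from the study):
# component 0 is precise (small error), component 1 is noisy.
cov = [[1.0, 0.5],
       [0.5, 1.0]]
err = [0.25, 1.0]

equal = composite_reliability([0.5, 0.5], cov, err)
favour_precise = composite_reliability([0.8, 0.2], cov, err)
print(round(equal, 3), round(favour_precise, 3))
```

In this made-up case, shifting weight towards the more precise component raises the composite reliability. This is the same kind of effect the abstract reports when tests are weighted by number of items or by test time, although the study's actual estimates come from a full multivariate generalizability analysis.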