Background In undergraduate clinical examinations, the use of real patients as long cases is being replaced by objective structured clinical examinations (OSCEs), which use simulated scenarios, although published psychometric data on long cases to support the move from real to simulated patients are lacking.
Aim To assess candidate performance across two history-taking long cases in order to estimate the number of cases required for a reliable assessment. Results are compared with psychometric data from an OSCE.
Setting A final-year qualifying undergraduate clinical examination.
Method Two observed history-taking long cases were included, alongside an OSCE. Candidates interviewed two unstandardized real patients. The history-taking part (14 minutes) was observed, uninterrupted, by examiner(s), who assessed data gathering, interviewing, and diagnostic and management skills. The presentation (7 minutes) was unstructured; the examiner(s) intervened as appropriate. Marks were expressed as a percentage of the total possible score and analysed using generalizability theory to estimate intercase reliability.
Results Two examiner pairs independently rated both long cases for 79 (36.7%) of the 214 candidates. Projections based on generalizability theory showed that ten 20-minute cases would give reliabilities of 0.84 for single-marked and 0.88 for double-marked candidates, compared with a projected reliability of 0.73 for the same 214 candidates taking the OSCE.
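For a single facet (cases), the decision-study projection used in generalizability theory reduces to the Spearman-Brown prophecy formula. The sketch below illustrates the arithmetic behind the reported projections; the single-case coefficient is an assumed value chosen so that ten cases project to roughly 0.84, and is not a figure taken from the study.

```python
def projected_reliability(r_single: float, n_cases: int) -> float:
    """Spearman-Brown projection of reliability when the test is
    lengthened from one case to n_cases comparable cases."""
    return n_cases * r_single / (1 + (n_cases - 1) * r_single)

# Assumed (illustrative) single-case reliability of about 0.34:
r1 = 0.344
print(round(projected_reliability(r1, 10), 2))  # about 0.84
```

The formula shows why adding unstandardized cases raises reliability quickly at first and more slowly thereafter, which is the basis for estimating the number of cases needed for a dependable pass/fail decision.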
Conclusion If history-taking long cases are observed, three and a half hours of testing time using ten unstandardized patients would produce a reliable test. In terms of reliability, long cases are therefore no worse and no better than OSCEs for assessing clinical competence.