S. Bond et al., Testing a rating scale of video-taped consultations to assess performance of trainee nurse practitioners in general practice, J ADV NURS, 30(5), 1999, pp. 1064-1072
Background: Nurse practitioners (NPs) in the United Kingdom are taking on some of the consultation work previously done by general practitioners (GPs) without there being any established professional standards that they must achieve before doing so. There is a need to develop and test methods of assessing their consultation performance for reasons of professional accreditation and patient safety.
Aims: 1. To make independent summative assessments of trainee nurse practitioners' (TNPs) consultation performance. 2. To assess the validity and reliability of an existing video-taped assessment tool.
Method: Four TNPs taking part in the EROS (extended roles of staff) study video recorded seven or eight consecutive consultations with typical patients during one surgery. Each consultation was rated nine times by members of a panel comprising eight independent GP trainers, four NPs and the GPs and TNPs in the EROS practices. A rating scale developed by Cox & Mulholland was used for the purpose.
Results: Eight of the 37 items and four consultations had more than 10% missing data, mean = 7.7 items per rater. Factor analysis yielded a single factor solution explaining 32.6% of the variance and indicated that items could be summed to provide a single score. Internal consistency was high, alpha coefficient = 0.92. Individual differences between raters in scoring consultations were taken into account in providing a score for each consultation. Scores obtained were found to cluster at the positive end of the distribution indicating a high level of performance. Greater differences were found between scorers than between consultations.
Conclusions: This instrument is appropriate for scoring NP consultations and this small sample was rated as showing a uniformly high standard of performance. Some items could be deleted since they do not feature in the range of consultations currently performed. If this or a similar tool was to be adopted more widely for summative rating purposes then it should be tested rigorously for validity and reliability, training should be given to raters and criteria provided by which to make judgements.
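
Illustrative note: the abstract reports an internal-consistency alpha coefficient of 0.92 for the summed rating scale. Assuming this refers to Cronbach's alpha (the abstract does not name the statistic), the calculation can be sketched as below. The function name `cronbach_alpha` and the toy ratings are hypothetical and are not data from the study; the sketch also assumes complete ratings, whereas the study reports missing data that would need handling first.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a complete ratings matrix.

    ratings: list of rows, one row per rated consultation,
             each row holding one numeric score per scale item.
    """
    k = len(ratings[0])  # number of items on the scale
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two rated consultations")

    # Sample variance of each item (column) across rows.
    item_variances = [variance(column) for column in zip(*ratings)]

    # Sample variance of the summed score per row.
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("summed scores have zero variance; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (invented for illustration): 4 rated consultations x 3 items.
toy = [
    [4, 5, 4],
    [3, 4, 3],
    [5, 5, 4],
    [2, 3, 2],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))  # 0.975
```

Values near 1 indicate that the items vary together, which is consistent with the abstract's conclusion that the items could be summed to give a single score.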