BACKGROUND. Assess the reproducibility of methods to measure overuse of cat
aract surgery.
OBJECTIVES. The objectives of this study are: (1) to determine the extent of agreement about clinical scenarios among, between, and within three physician panels; (2) to apply ratings of clinical scenarios from three panels to actual surgeries; and (3) to assess the reproducibility of rates of appropriate use and overuse.
METHODS. Three physician panels scored 2,894 clinical scenarios for the appropriate use of cataract surgery. One thousand and twenty charts were abstracted and assigned to the clinical scenario that best corresponded to the patient's clinical situation. Two hundred fifty-nine clinical scenarios were required to assign the cases. Weighted kappa values, confidence intervals, and percentages of agreement were used to measure agreement among, between, and within panels.
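The panel-agreement statistic used above can be sketched in code. The function below computes Cohen's weighted kappa for two raters' ordinal appropriateness scores; the study does not state its exact weighting scheme, so the linear and quadratic weights, the function name, and the score encoding here are illustrative assumptions only.

```python
import numpy as np

def weighted_kappa(ratings_a, ratings_b, n_cat, weights="linear"):
    """Cohen's weighted kappa for two raters scoring the same cases
    on an ordinal scale coded 0 .. n_cat-1 (hypothetical helper)."""
    ratings_a = np.asarray(ratings_a)
    ratings_b = np.asarray(ratings_b)

    # Observed joint distribution of the two raters' scores.
    observed = np.zeros((n_cat, n_cat))
    for a, b in zip(ratings_a, ratings_b):
        observed[a, b] += 1
    observed /= observed.sum()

    # Expected joint distribution under chance (product of marginals).
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))

    # Disagreement weights: linear or quadratic distance between categories.
    i, j = np.indices((n_cat, n_cat))
    if weights == "linear":
        w = np.abs(i - j) / (n_cat - 1)
    else:
        w = ((i - j) / (n_cat - 1)) ** 2

    # Kappa = 1 - (weighted observed disagreement / weighted chance disagreement).
    return 1.0 - (w * observed).sum() / (w * expected).sum()
```

By convention, values near 1 indicate near-perfect agreement, values in the 0.41-0.60 range (such as the 0.53 reported below) indicate moderate agreement, and 0 indicates agreement no better than chance.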
RESULTS. The all-ophthalmologist panel (OP) and the convened multispecialty panel (CM) each rated 92% of the cases as appropriate use, compared with 70% by the mail-in multispecialty panel (MM). The MM had higher rates of uncertain use (26% vs. 8% and 7%) and of inappropriate use (3.5% vs. 0.1% and 1.9%). For the clinical scenarios, the CM and the MM had similar percentages of overuse (6.6% and 7.3%), in contrast to the OP (0.4%). The weighted kappa value for the overall level of agreement about the clinical scenarios among the three panels was 0.53, consistent with moderate agreement.
CONCLUSIONS. Study results demonstrate reproducibility of the assessment of appropriate use of surgery between the OP and the CM. However, both multispecialty panels rated more clinical scenarios as inappropriate use than the ophthalmologist panel did. Thus, reproducibility between the CM and the OP may be attributable to the low percentage of overuse of cataract surgery in the study population. The overall level of agreement about the clinical scenarios among the panels is moderate.