Study Design. Reliability study of guidelines development.
Objective. To compare criteria for low back surgery between two expert
panels.
Background. Reliability of expert panels for determining appropriateness of
indications for surgical procedures has heretofore received little
attention.
Methods. Two multidisciplinary expert panels of similar composition were
convened, in the United States and in Switzerland, to evaluate the
appropriateness of 720 distinct clinical scenarios involving sciatica. Each
indication was assigned to one of three categories: appropriate, uncertain,
or inappropriate. The appropriateness ratings of the 720 theoretical
scenarios were compared between the two panels, and both sets of criteria
were applied to two series of actual cases.
Results. Seventy-nine percent (n = 566) of the 720 theoretical indications
were assigned to identical categories of appropriateness by both panels
(kappa = 0.63; P < 0.001). Only 2 of the 720 scenarios elicited frank
disagreement. The percentage of the 720 indications that were considered
appropriate differed between the two panels (U.S.: 3%; Swiss: 11%;
P < 0.001), as did the percentage of intrapanel agreement for indications
(U.S.: 51%; Swiss: 64%; P < 0.001). When the same theoretical scenarios
were matched with two series of actual cases (n = 181 and 149), agreement
was moderate (kappa = 0.46) to fair (kappa = 0.30).
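For readers unfamiliar with the kappa statistic reported above, the following is a minimal sketch of how chance-corrected agreement between two panels can be computed from a cross-tabulation of their ratings. It assumes unweighted Cohen's kappa; the abstract does not state which variant was used. The function name `cohen_kappa` and the counts in `toy_table` are hypothetical illustrations and are not data from this study.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square agreement table.

    table[i][j] = number of scenarios that panel A placed in category i
    and panel B placed in category j.
    """
    k = len(table)
    total = sum(sum(row) for row in table)

    # Observed agreement: share of scenarios on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total

    # Agreement expected by chance, from the marginal totals of each panel.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2

    return (observed - expected) / (1 - expected)


# Hypothetical counts (NOT study data). Rows and columns are ordered:
# appropriate, uncertain, inappropriate.
toy_table = [
    [20, 5, 0],
    [10, 30, 5],
    [0, 10, 20],
]

kappa = cohen_kappa(toy_table)
print(f"kappa = {kappa:.2f}")
```

In this toy table the raw agreement is 70%, but kappa is noticeably lower because part of that agreement would be expected by chance alone. The same effect explains why 79% identical ratings in the study correspond to a kappa of 0.63.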
Conclusion. There was substantial agreement on the appropriateness of
surgery for theoretical cases of sciatica between independent expert panels
from two countries. A better understanding of discordant ratings,
especially for actual cases, should precede attempts at transposing
recommendations emanating from a panel in one country to another.