Objectives: The objectives of this paper were: a) to determine what can be
learned from conclusions of systematic reviews about the evidence base of
medicine; and b) to determine whether two readers draw similar conclusions
from the same review, and whether these match the authors' conclusions.
Methods: Three methodologists (two per review) rated 160 Cochrane
systematic reviews (issue 1, 1998) using pre-established conclusion
categories. Disagreements were resolved by discussion to arrive at a
consensual score for each review. Reviews' authors were asked to use the
same categories to designate the intended conclusion. Interrater
agreements were calculated.
Results: Interrater agreement between two readers was 0.68 and 0.72, and
between readers and authors, 0.32. The largest categories assigned by
methodologists were "positive effect" (22.5%), "insufficient evidence"
(21.3%), and "evidence of no effect" (20.0%). The largest categories
assigned by authors were "insufficient evidence" (32.4%), "possibly
positive" (28.6%), and "positive effect" (26.7%).
Conclusions: The number of reviews indicating that modern biomedical
interventions show either no effect or insufficient evidence is
surprisingly high. Interrater disagreements suggest a surprising degree of
subjective interpretation involved in systematic reviews. Where patterns
of disagreement emerged between authors and readers, authors tended to be
more optimistic in their conclusions than the readers. Policy implications
are discussed.