Virtually all reviews of cumulated studies rely on statistical
significance as a criterion for evaluating the reproducibility of the
phenomenon under review. Despite its nearly universal application, that
criterion is entirely inadequate. Its application is very likely to lead
a reviewer to conclude that a phenomenon does not discriminate patients
from controls when, in fact, it does. Paradoxically, the reviewer is more
likely to draw this incorrect conclusion as more studies become available
for review. The criterion can also lead a reviewer to conclude that one
phenomenon is more discriminating than another when the opposite is
actually true. Fortunately, procedures that do not distort the review
process are available; some of these are briefly discussed.
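
The paradox that more studies make the wrong conclusion more likely can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions, not a procedure taken from this paper: it assumes a reviewer who tallies studies and accepts a phenomenon as real only when more than half of them reach statistical significance, that the studies are independent, and that each has the same hypothetical power of 0.4 to detect a true effect. The function name, the majority cutoff, and the power value are all invented for the example.

```python
from math import comb


def prob_vote_count_detects(k, power, cutoff=0.5):
    """Probability that more than cutoff * k of k independent studies are
    statistically significant, when each study has the given power.

    Assumption for illustration: the reviewer accepts the phenomenon only
    if the share of significant studies exceeds the cutoff.
    """
    return sum(
        comb(k, i) * power**i * (1 - power) ** (k - i)
        for i in range(k + 1)
        if i > cutoff * k  # count only tallies that clear the cutoff
    )


# Hypothetical per-study power of 0.4: a real effect, modestly powered studies.
for k in (5, 10, 20, 50):
    print(k, round(prob_vote_count_detects(k, power=0.4), 3))
```

Under these assumptions, the chance that the tally favors a real effect shrinks as the number of studies grows, because the observed share of significant studies settles near the per-study power, which lies below the cutoff. If the assumed power were above the cutoff, the trend would reverse.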