Hypothesis testing, in which the null hypothesis specifies no difference
between treatment groups, is an important tool in the assessment of new
medical interventions. For randomized clinical trials, permutation tests
that reflect the actual randomization are design-based analyses for such
hypotheses. This means that only such design-based permutation tests can
ensure internal validity, without which external validity is irrelevant.
However, because permutation tests are conservative, their virtues
continue to be debated in the literature, and conclusions are generally
of the type that permutation tests should always be used or that
permutation tests should never be used. A better conclusion might be that
there are situations in which permutation tests should be used, and other
situations in which they should not. This approach opens the door to
broader agreement, but raises the obvious question of when to use permutation
tests. We consider this issue from a variety of perspectives, and conclude
that permutation tests are ideal for studying efficacy in a randomized clinical
trial that compares, in a heterogeneous patient population, two or more
treatments, each of which may be most effective in some patients, when the
primary analysis does not adjust for covariates. We propose the p-value
interval as a novel measure of the conservatism of a permutation test that can
be defined independently of the significance level. This p-value interval
can be used to ensure that the permutation test has both good global power
and an acceptable degree of conservatism. Copyright (C) 2000 John Wiley &
Sons, Ltd.