The two-stage design involves sample size recalculation using an interim
variance estimate. Stein proposed the design in 1945; biostatisticians have
recently shown renewed interest in it. Wittes and Brittain proposed a
modification aimed at greater efficiency; Gould and Shih proposed a similar
procedure, but with a different interim variance estimate based on blinded data.
We compare the power of Stein's original test, an idealized version of the
Wittes-Brittain test, and a theoretically optimal test which can be
approximated in practice. We also compare two procedures that control the
conditional type I error rate given the actual final sample size: Gould and
Shih's procedure and a newly proposed 'second segment' procedure. The
comparison among the first three procedures indicates that the Stein test is,
unexpectedly, the test of choice under the original design alternative,
whereas the approximate-optimal and Wittes-Brittain procedures appear to have
superior power for detecting smaller treatment differences. Of the latter two
procedures, the second segment procedure is more powerful when many
observations are likely to be taken after the interim resizing, whereas otherwise
the Gould-Shih procedure is superior. Copyright (C) 1999 John Wiley & Sons, Ltd.
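
As a rough illustration of what sample size recalculation from an interim variance estimate can look like, the sketch below recomputes the per-arm sample size for a two-arm comparison of means. It is a simplified stand-in, not any of the procedures compared here: it assumes an unblinded pooled variance, uses a normal approximation in place of Stein's t-based quantity, and never lets the final sample size fall below the interim one. The function name and its parameters (delta, alpha, power) are hypothetical.

```python
import math
from statistics import NormalDist, variance


def recalculated_n_per_arm(interim_a, interim_b, delta, alpha=0.05, power=0.9):
    """Per-arm sample size recomputed from an interim variance estimate.

    Assumptions (not taken from the paper): unblinded interim data,
    two-sided test at level alpha, normal-approximation formula
    n = 2 * s^2 * (z_{1-alpha/2} + z_{power})^2 / delta^2.
    """
    n_a, n_b = len(interim_a), len(interim_b)

    # Pooled within-arm variance from the first-stage observations.
    pooled_var = (
        (n_a - 1) * variance(interim_a) + (n_b - 1) * variance(interim_b)
    ) / (n_a + n_b - 2)

    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)

    n_required = math.ceil(2 * pooled_var * z_sum ** 2 / delta ** 2)

    # Observations already taken cannot be discarded, so the final size
    # is at least the first-stage size.
    return max(n_required, n_a, n_b)


if __name__ == "__main__":
    stage_one_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    stage_one_b = [2.0, 3.0, 4.0, 5.0, 6.0]
    print(recalculated_n_per_arm(stage_one_a, stage_one_b, delta=1.0))
```

A blinded variant in the spirit of Gould and Shih would replace the pooled within-arm variance with an estimate computed without knowledge of treatment assignment; that step is not shown.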