R.R. Sitter, VARIANCE-ESTIMATION FOR THE REGRESSION ESTIMATOR IN 2-PHASE SAMPLING, Journal of the American Statistical Association, 92(438), 1997, pp. 780-787
Many techniques in survey sampling depend on the possession of
information about an auxiliary variable $x$, or a vector of auxiliary
variables, available for the entire population. Regression estimates
require $\bar{X}$, the population mean. If such information is
unavailable, then one can sometimes obtain a large preliminary sample of
$x_i$ relatively cheaply and use this to obtain a good estimate, say
$\bar{x}'$, of $\bar{X}$. A smaller subsample can then be taken and the
characteristic of interest, $y_i$, measured. A regression estimator can
then be used, treating $\bar{x}'$ as if it were $\bar{X}$. This is termed
double sampling, or two-phase sampling. This article focuses on variance
estimators for the regression estimator in the aforementioned context
and their use in constructing confidence intervals. A design-based
linearization variance estimator that makes more complete use of the
sample data than the standard one is considered for two-phase sampling.
A jackknife variance estimator and its linearized version are obtained
and shown to be design consistent. A bootstrap variance estimator is
also shown to be design consistent. Unconditional and conditional
repeated sampling properties of these variance estimators are studied
through simulation. It is shown that the linearization variance
estimator displays superior unconditional properties, but the jackknife
and its linearized version perform better conditionally.
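
The double-sampling procedure described in the abstract can be illustrated with a minimal Python sketch. It assumes simple random sampling at both phases and a single auxiliary variable. The function names, the argument layout (`x_first`, `sub_idx`, `y_sub`), and the particular delete-one jackknife form are assumptions made for illustration; they are not taken from the article, whose estimators may differ in detail.

```python
from statistics import mean


def regression_estimate(x_first, x_sub, y_sub):
    """Two-phase regression estimate of the population mean of y.

    x_first : x values from the large first-phase sample
    x_sub   : x values of the second-phase subsample
    y_sub   : y values measured on the same subsample
    """
    x_bar_prime = mean(x_first)  # stands in for the unknown population mean of x
    x_bar = mean(x_sub)
    y_bar = mean(y_sub)
    sxx = sum((x - x_bar) ** 2 for x in x_sub)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_sub, y_sub))
    slope = sxy / sxx  # least-squares slope fitted on the subsample
    return y_bar + slope * (x_bar_prime - x_bar)


def jackknife_variance(x_first, sub_idx, y_sub):
    """Delete-one jackknife over first-phase units (illustrative form only).

    sub_idx lists the positions in x_first that were subsampled, and
    y_sub holds the matching y values. Deleting a first-phase unit that
    also belongs to the subsample removes it from both phases.
    """
    n1 = len(x_first)
    y_by_idx = dict(zip(sub_idx, y_sub))
    replicates = []
    for j in range(n1):
        keep_first = [x for i, x in enumerate(x_first) if i != j]
        keep_idx = [i for i in sub_idx if i != j]
        replicates.append(
            regression_estimate(
                keep_first,
                [x_first[i] for i in keep_idx],
                [y_by_idx[i] for i in keep_idx],
            )
        )
    centre = mean(replicates)
    return (n1 - 1) / n1 * sum((r - centre) ** 2 for r in replicates)


# Toy data (made up): y = 2x + 1 exactly, so the fitted slope is 2.
x_first = [1, 2, 3, 4, 5, 6, 7, 8]
sub_idx = [0, 3, 6]
y_sub = [3, 9, 15]
estimate = regression_estimate(x_first, [x_first[i] for i in sub_idx], y_sub)
variance = jackknife_variance(x_first, sub_idx, y_sub)
```

A normal-theory confidence interval would then take the form `estimate ± z * variance ** 0.5` for a chosen quantile `z`. The subsample needs at least three distinct x values so that the slope is still defined after a unit is deleted.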