S.V. Stehman and W.S. Overton, Comparison of Variance Estimators of the Horvitz-Thompson Estimator for Randomized Variable Probability Systematic Sampling, Journal of the American Statistical Association, 89(425), 1994, pp. 30-43
The National Stream Survey (NSS) and Environmental Monitoring and
Assessment Program (EMAP) use variable probability, systematic sampling,
and the Horvitz-Thompson estimator to estimate population parameters of
ecological interest. A common strategy of variance estimation for
systematic sampling is to assume that the population order had been
randomized prior to sampling and to estimate variance under this
randomized population model. The Yates-Grundy variance estimator is
generally recommended for estimating the variance of the
Horvitz-Thompson estimator under this model. But design features of NSS
and EMAP preclude application of the Yates-Grundy estimator, so use of
the Horvitz-Thompson variance estimator is required. Further, because
the first-order inclusion probabilities are known only for the sample
units and not the entire population, neither the actual pairwise
inclusion probabilities (pi(uv)'s) nor the Hartley-Rao approximation of
the pi(uv)'s can be computed. Thus the variance estimator proposed for
use in these surveys was the Horvitz-Thompson variance estimator
computed with a new approximation to the pi(uv)'s. Having to use this
estimator, denoted upsilon(HT)o, motivated exploration of the general
question of when behaviors of the Horvitz-Thompson and Yates-Grundy
variance estimators differ and also investigation of the specific
performance of the estimator upsilon(HT)o. To permit comparison of
variance estimators, we restricted attention to fixed sample size,
variable probability systematic sampling, from a randomly sorted list.
Properties of upsilon(HT)o were compared to those of three other
variance estimators: the Yates-Grundy estimator calculated with both the
new pi(uv) approximation and the Hartley-Rao approximation, and the
Horvitz-Thompson variance estimator calculated with the Hartley-Rao
approximation. An empirical study, designed to permit generalization
beyond a few special case populations, demonstrated that superiority of
the Yates-Grundy variance estimator was restricted to populations having
both high correlation between the response variable, y, and the
selection variable, x, and approximately equal coefficients of variation
for the x and y populations. With the exception of these populations,
upsilon(HT)o performed nearly the same as the Yates-Grundy estimators
studied and performed better than the Horvitz-Thompson variance
estimator computed with the Hartley-Rao approximation. In NSS and EMAP
most response variables are not expected to be highly correlated with
the selection variable, so upsilon(HT)o should furnish an adequate
variance approximation when the randomized population model holds.
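
The two estimator forms compared in the abstract can be illustrated with a minimal Python sketch. It computes the Horvitz-Thompson total and the textbook Horvitz-Thompson and Yates-Grundy variance estimators from a supplied matrix of pairwise inclusion probabilities. The function names, the `pi2` argument, and the simple-random-sampling check are illustrative assumptions and not the authors' code. Neither the paper's new pi(uv) approximation nor the Hartley-Rao approximation is reproduced here; either would be supplied through `pi2`.

```python
from itertools import combinations


def ht_total(y, pi):
    """Horvitz-Thompson estimate of the population total from sample values y
    and first-order inclusion probabilities pi (sample units only)."""
    return sum(yu / pu for yu, pu in zip(y, pi))


def ht_variance(y, pi, pi2):
    """Horvitz-Thompson form of the variance estimator.
    pi2[u][v] is the pairwise inclusion probability (exact or approximated)."""
    n = len(y)
    var = sum((1 - pi[u]) / pi[u] ** 2 * y[u] ** 2 for u in range(n))
    for u in range(n):
        for v in range(n):
            if u != v:
                weight = (pi2[u][v] - pi[u] * pi[v]) / (pi2[u][v] * pi[u] * pi[v])
                var += weight * y[u] * y[v]
    return var


def yg_variance(y, pi, pi2):
    """Yates-Grundy form of the variance estimator (fixed sample size designs)."""
    var = 0.0
    for u, v in combinations(range(len(y)), 2):
        weight = (pi[u] * pi[v] - pi2[u][v]) / pi2[u][v]
        var += weight * (y[u] / pi[u] - y[v] / pi[v]) ** 2
    return var


def srs_check(y, N):
    """Hypothetical sanity check: under simple random sampling without
    replacement the inclusion probabilities are known exactly, so both
    variance forms can be evaluated without any approximation."""
    n = len(y)
    pi = [n / N] * n
    p2 = n * (n - 1) / (N * (N - 1))
    pi2 = [[p2] * n for _ in range(n)]
    return ht_total(y, pi), ht_variance(y, pi, pi2), yg_variance(y, pi, pi2)
```

Under simple random sampling without replacement both forms reduce to the familiar N(N-n)s^2/n, where s^2 is the sample variance, which makes `srs_check` a convenient test case. With unequal probabilities and approximated pi(uv)'s the two forms generally give different values, which is the situation the abstract addresses.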