A. M. Zaslavsky, COMBINING CENSUS, DUAL-SYSTEM, AND EVALUATION STUDY DATA TO ESTIMATE POPULATION SHARES, Journal of the American Statistical Association, 88(423), 1993, pp. 1092-1105
The 1990 census and Post-Enumeration Survey produced census and dual system estimates (DSE) of population by domain, together with an estimated sampling covariance matrix of the DSE. Estimates of the bias of the DSE were derived from various PES evaluation programs. Of the three sources, the unadjusted census is the least variable but is believed to be the most biased, the DSE is less biased but more variable, and the bias estimates may be regarded as unbiased but are the most variable. This article addresses methods for combining the census, the DSE, and bias estimates obtained from the evaluation programs to produce accurate estimates of population shares, as measured by weighted squared- or absolute-error loss functions applied to estimated population shares of domains. Several procedures are reviewed that choose between the census and the DSE using the bias evaluation data or that average the two with weights that are constant across domains. A multivariate hierarchical Bayes model is proposed for the joint distribution of the undercount rates and the biases of the DSE in the various domains. The specification of the model is sufficiently flexible to incorporate prior information on factors likely to be associated with undercount and bias. When combined with data on undercount and bias estimates, the model yields posterior distributions for the true population shares of each domain. The performance of the estimators was compared through an extensive series of simulations. The hierarchical Bayes procedures are shown to outperform the other estimators over a wide range of conditions and to be robust against misspecification of the models. The various composite estimators, applied to preliminary data from the 1990 Census and evaluation programs, yield similar results that are closer to the DSE than to the census. Analysis of a revised data set yields qualitatively similar estimates but shows that the revised post-stratification improves on the original one.
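
As a rough illustration of the simplest estimators mentioned in the abstract, the sketch below averages census and DSE counts with a single weight shared by all domains and scores the result with a weighted squared-error loss on population shares. It is not the article's procedure: the function names, the loss weights, and all of the numbers are hypothetical, chosen only to make the idea concrete.

```python
def shares(counts):
    """Convert domain population counts to shares that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def composite(census, dse, w):
    """Average census and DSE counts using one weight w for every domain.

    w = 0 returns the census, w = 1 returns the DSE.
    """
    return [(1.0 - w) * c + w * d for c, d in zip(census, dse)]


def weighted_squared_loss(estimate, truth, weights):
    """Weighted squared-error loss between estimated and true shares."""
    est_s, true_s = shares(estimate), shares(truth)
    return sum(q * (e - t) ** 2 for q, e, t in zip(weights, est_s, true_s))


# Hypothetical counts for three domains (assumed values, not from the article).
census = [980.0, 1950.0, 2900.0]
dse = [1020.0, 2000.0, 2950.0]
truth = [1010.0, 1990.0, 2940.0]
loss_weights = [1.0, 1.0, 1.0]  # assumed equal weighting of domains

for w in (0.0, 0.5, 1.0):
    est = composite(census, dse, w)
    print(w, weighted_squared_loss(est, truth, loss_weights))
```

In practice the true counts are unknown, which is why the abstract describes using bias estimates from the evaluation programs, and ultimately a hierarchical Bayes model, rather than a loss computed against known truth.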