P. H. Kvam et al., Nonparametric Bayes estimation of contamination levels using observations from the residual distribution, Journal of the American Statistical Association, 95(452), 2000, pp. 1119-1126
Data on contamination concentrations for chromium from one of the EPA's toxic waste sites consist of independent and identically distributed (iid) measurements along with additional observations from the residual distribution. The residual sample is obtained by sampling from hot spots, where contamination concentrations are assumed to be above a given threshold value. The data are modeled using a nonparametric Bayes estimator of the distribution function. The Dirichlet process is used to formulate prior information about the chromium contamination, and we compare the Bayes estimator of the mean concentration level to other estimators currently considered by the EPA and other sources. The Bayes estimator of the mean generally outperforms competing estimators under various cost functions. The Bayes estimator of the distribution function is derived assuming the possibility of right-censored contamination measurements along with left-truncated hot spot data. For the case in which the prior becomes noninformative, the Bayes estimator of the distribution function is the nonparametric maximum likelihood estimator, which is identical to the Kaplan-Meier estimator for concentration values observed below the residual sample threshold. Robustness of the Bayes estimator is examined with respect to misspecification of the prior and its sensitivity to the censoring distribution.
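
As a rough illustration of how a Dirichlet process prior combines with data, the sketch below covers only the simplest textbook case: a complete iid sample with no censoring, no truncation, and no hot spot observations. It is not the estimator derived in the paper. The function names, the exponential prior guess, and the sample values are hypothetical, chosen only for illustration. Under squared-error loss, the posterior mean of the distribution function is a weighted average of the prior guess and the empirical distribution function, with weights alpha and n.

```python
import math


def dp_posterior_cdf(t, data, alpha, prior_cdf):
    """Posterior mean of F(t) under a Dirichlet process prior.

    Assumes a complete iid sample (no censoring or truncation).
    alpha is the prior concentration; prior_cdf is the prior guess at F.
    """
    n = len(data)
    empirical = sum(1 for x in data if x <= t) / n
    return (alpha * prior_cdf(t) + n * empirical) / (alpha + n)


def dp_posterior_mean(data, alpha, prior_mean):
    """Posterior mean of the population mean under the same assumptions."""
    n = len(data)
    return (alpha * prior_mean + sum(data)) / (alpha + n)


def exp_cdf(t, mean=10.0):
    """Hypothetical prior guess: exponential distribution with mean 10."""
    return 1.0 - math.exp(-t / mean) if t > 0 else 0.0


# Hypothetical concentration measurements, for illustration only.
sample = [2.0, 4.0, 6.0, 8.0]

# Prior weight alpha = 1 pulls the sample mean (5.0) toward the prior mean (10.0).
print(dp_posterior_mean(sample, alpha=1.0, prior_mean=10.0))

# With alpha = 0 the estimator reduces to the empirical distribution function.
print(dp_posterior_cdf(5.0, sample, alpha=0.0, prior_cdf=exp_cdf))
```

In this simplified setting, letting alpha go to zero recovers the empirical distribution function, which parallels the abstract's remark that a noninformative prior yields the nonparametric maximum likelihood estimator.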