Xw. Lin et al., Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV, ANN STATIST, 28(6), 2000, pp. 1570-1600
We propose the randomized Generalized Approximate Cross Validation (ranGACV)
method for choosing multiple smoothing parameters in penalized likelihood
estimates for Bernoulli data. The method is intended for application with
penalized likelihood smoothing spline ANOVA models. In addition, we propose
a class of approximate numerical methods for solving the penalized
likelihood variational problem which, in conjunction with the ranGACV
method, allows the application of smoothing spline ANOVA models with
Bernoulli data to much larger data sets than previously possible. These
methods are based on choosing an approximating subset of the natural
(representer) basis functions for the variational problem. Simulation
studies with synthetic data, including synthetic data mimicking demographic
risk factor data sets, are used to examine the properties of the method and
to compare the approach with the GRKPACK code of Wang (1997c). Bayesian
"confidence intervals" are obtained for the fits and are shown in the
simulation studies to have the "across the function" property usually
claimed for these confidence intervals. Finally, the method is applied to an
observational data set from the Beaver Dam Eye Study, with scientifically
interesting results.
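
The idea of fitting a penalized Bernoulli likelihood on a subset of the
representer basis functions can be illustrated with a minimal sketch. This is
not the authors' code and makes several assumptions not found in the abstract:
a Gaussian kernel stands in for the spline reproducing kernel, the basis
subset is drawn uniformly at random, the smoothing parameter `lam` is fixed by
hand rather than chosen by ranGACV, and all function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, width=0.2):
    # Stand-in kernel (assumption); not the spline kernel used in SS-ANOVA.
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))


def objective(theta, X, P, y, lam):
    # Mean negative Bernoulli log likelihood plus quadratic roughness penalty.
    f = X @ theta
    return np.mean(np.logaddexp(0.0, f) - y * f) + 0.5 * lam * theta @ P @ theta


def fit_subset_basis(x, y, n_basis=20, lam=1e-3, n_iter=25, seed=0):
    """Fit the logit f(x) = b + sum_j c_j k(x, x_j) over a random subset of x_j."""
    rng = np.random.default_rng(seed)
    n = len(x)
    k = min(n_basis, n)
    # Assumption: the approximating subset is a simple random sample.
    centers = x[rng.choice(n, size=k, replace=False)]
    # Design matrix: unpenalized intercept column plus k kernel columns.
    X = np.column_stack([np.ones(n), gaussian_kernel(x, centers)])
    # Penalty acts on the kernel coefficients only; small ridge for stability.
    P = np.zeros((k + 1, k + 1))
    P[1:, 1:] = gaussian_kernel(centers, centers) + 1e-6 * np.eye(k)
    theta = np.zeros(k + 1)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))
        grad = X.T @ (p - y) / n + lam * P @ theta
        hess = (X.T * (p * (1.0 - p))) @ X / n + lam * P
        step = np.linalg.solve(hess, grad)
        # Step halving keeps each Newton update from increasing the objective.
        t, j0 = 1.0, objective(theta, X, P, y, lam)
        while objective(theta - t * step, X, P, y, lam) > j0 and t > 1e-4:
            t *= 0.5
        theta = theta - t * step
    return centers, theta


def predict_proba(x_new, centers, theta):
    # Estimated success probability at new points.
    X = np.column_stack([np.ones(len(x_new)), gaussian_kernel(x_new, centers)])
    return 1.0 / (1.0 + np.exp(-(X @ theta)))
```

With `n` observations and `k` basis functions, each Newton step in this sketch
solves a (k+1)-by-(k+1) system instead of an n-by-n one, which is the kind of
saving that makes larger data sets tractable. Calling
`fit_subset_basis(x, y)` on one-dimensional inputs `x` and 0/1 responses `y`
returns the selected centers and coefficients for `predict_proba`.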