Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV

Citation
Xw. Lin et al., Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV, ANN STATIST, 28(6), 2000, pp. 1570-1600
Number of citations
49
Subject categories
Mathematics
Journal title
ANNALS OF STATISTICS
Journal ISSN
0090-5364
Volume
28
Issue
6
Year of publication
2000
Pages
1570 - 1600
Database
ISI
SICI code
0090-5364(200012)28:6<1570:SSAMFL>2.0.ZU;2-O
Abstract
We propose the randomized Generalized Approximate Cross Validation (ranGACV) method for choosing multiple smoothing parameters in penalized likelihood estimates for Bernoulli data. The method is intended for application with penalized likelihood smoothing spline ANOVA models. In addition, we propose a class of approximate numerical methods for solving the penalized likelihood variational problem which, in conjunction with the ranGACV method, allows the application of smoothing spline ANOVA models with Bernoulli data to much larger data sets than previously possible. These methods are based on choosing an approximating subset of the natural (representer) basis functions for the variational problem. Simulation studies with synthetic data, including synthetic data mimicking demographic risk factor data sets, are used to examine the properties of the method and to compare the approach with the GRKPACK code of Wang (1997c). Bayesian "confidence intervals" are obtained for the fits and are shown in the simulation studies to have the "across the function" property usually claimed for these confidence intervals. Finally, the method is applied to an observational data set from the Beaver Dam Eye Study, with scientifically interesting results.
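
Illustrative sketch
The idea of solving the penalized likelihood problem in an approximating subset of the representer basis functions can be illustrated with a small example. The sketch below is not the authors' code and makes several assumptions that do not come from the abstract: a Gaussian kernel stands in for the spline reproducing kernel, the subset is drawn uniformly at random, the single smoothing parameter `lam` is fixed by hand rather than chosen by ranGACV, and all names (`gaussian_kernel`, `fit_subset_basis`, `predict_prob`) are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, width=0.2):
    """Gaussian kernel matrix between 1-D point sets a and b (stand-in kernel)."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))


def fit_subset_basis(x, y, m=20, lam=1e-3, n_iter=50, seed=0):
    """Penalized Bernoulli likelihood fit using m of the n representer basis functions.

    Minimizes mean(log(1 + exp(f_i)) - y_i * f_i) + (lam / 2) * c' K_ss c,
    where f = K_ns c, by Newton steps with step halving.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Assumption: the approximating subset is a uniform random sample of the data.
    centers = x[rng.choice(n, size=min(m, n), replace=False)]
    K_ns = gaussian_kernel(x, centers)        # n x m: basis functions at the data
    K_ss = gaussian_kernel(centers, centers)  # m x m: penalty matrix
    c = np.zeros(centers.shape[0])

    def objective(coef):
        f = K_ns @ coef
        return np.mean(np.logaddexp(0.0, f) - y * f) + 0.5 * lam * coef @ K_ss @ coef

    for _ in range(n_iter):
        f = np.clip(K_ns @ c, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-f))
        w = p * (1.0 - p)
        grad = K_ns.T @ (p - y) / n + lam * (K_ss @ c)
        # Small ridge term keeps the linear solve numerically stable.
        hess = (K_ns.T * w) @ K_ns / n + lam * K_ss + 1e-8 * np.eye(c.size)
        step = np.linalg.solve(hess, grad)
        t, current = 1.0, objective(c)
        while objective(c - t * step) > current and t > 1e-4:
            t *= 0.5  # halve the step until the objective no longer increases
        c = c - t * step
        if np.max(np.abs(t * step)) < 1e-8:
            break
    return centers, c


def predict_prob(x_new, centers, c):
    """Estimated success probability at new points."""
    f = np.clip(gaussian_kernel(x_new, centers) @ c, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-f))
```

With n observations and m selected basis functions, each Newton step here solves an m x m linear system instead of an n x n one, which is the kind of saving that makes larger data sets tractable.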