Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV

Authors

Lin, XW; Wahba, G; Xiang, D; Gao, FY; Klein, R; Klein, B

Citation

Xw. Lin et al., Smoothing spline ANOVA models for large data sets with Bernoulli observations and the randomized GACV, ANN STATIST, 28(6), 2000, pp. 1570-1600

Citations number

Subject categories

Mathematics

Journal title

ANNALS OF STATISTICS

ISSN journal

0090-5364

Volume

28

Issue

6

Year of publication

2000

Pages

1570 - 1600

Database

ISI

SICI code

0090-5364(200012)28:6<1570:SSAMFL>2.0.ZU;2-O

Abstract

We propose the randomized Generalized Approximate Cross Validation (ranGACV) method for choosing multiple smoothing parameters in penalized likelihood estimates for Bernoulli data. The method is intended for application with penalized likelihood smoothing spline ANOVA models. In addition, we propose a class of approximate numerical methods for solving the penalized likelihood variational problem which, in conjunction with the ranGACV method, allows the application of smoothing spline ANOVA models with Bernoulli data to much larger data sets than previously possible. These methods are based on choosing an approximating subset of the natural (representer) basis functions for the variational problem. Simulation studies with synthetic data, including synthetic data mimicking demographic risk factor data sets, are used to examine the properties of the method and to compare the approach with the GRKPACK code of Wang (1997c). Bayesian "confidence intervals" are obtained for the fits and are shown in the simulation studies to have the "across the function" property usually claimed for these confidence intervals. Finally, the method is applied to an observational data set from the Beaver Dam Eye Study, with scientifically interesting results.