Bayesian Variable Selection for High Dimensional Generalized Linear Models: Convergence Rates of the Fitted Densities

Authors
Jiang, Wenxin
Citation
Jiang, Wenxin, Bayesian Variable Selection for High Dimensional Generalized Linear Models: Convergence Rates of the Fitted Densities, Annals of Statistics, 35(4), 2007, pp. 1487-1511.
Journal title
Annals of Statistics
ISSN journal
0090-5364
Volume
35
Issue
4
Year of publication
2007
Pages
1487 - 1511
Database
ACNP
SICI code
Abstract
Bayesian variable selection has gained much empirical success recently in a variety of applications when the number $K$ of explanatory variables $(x_{1},\ldots,x_{K})$ is possibly much larger than the sample size $n$. For generalized linear models, if most of the $x_{j}$'s have very small effects on the response $y$, we show that it is possible to use Bayesian variable selection to reduce overfitting caused by the curse of dimensionality $K \gg n$. In this approach a suitable prior can be used to choose a few out of the many $x_{j}$'s to model $y$, so that the posterior will propose probability densities $p$ that are "often close" to the true density $p^{*}$ in some sense. The closeness can be described by a Hellinger distance between $p$ and $p^{*}$ that scales at a power very close to $n^{-1/2}$, which is the "finite-dimensional rate" corresponding to a low-dimensional situation. These findings extend some recent work of Jiang [Technical Report 05-02 (2005), Dept. Statistics, Northwestern Univ.] on consistency of Bayesian variable selection for binary classification.
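One standard convention for the Hellinger distance mentioned in the abstract is

$$d_{H}(p,p^{*}) = \Bigl( \int \bigl( \sqrt{p} - \sqrt{p^{*}} \bigr)^{2} \, d\mu \Bigr)^{1/2},$$

where conventions differ by a constant factor; the article itself gives the exact definition and regularity conditions it uses. The abstract's rate claim can then be read as saying that, with high posterior probability, $d_{H}(p,p^{*})$ is of order $n^{-\alpha}$ for some $\alpha$ close to $1/2$.

The idea of a prior that chooses a few out of the many $x_{j}$'s can be illustrated with a minimal sketch, assuming a generic spike-and-slab prior and a logistic generalized linear model; this is only an illustrative stand-in, not the prior specified in the article.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: many candidate covariates, few observations (K >> n).
K, n = 500, 100

# Generic spike-and-slab prior (an assumption for this sketch): each coefficient
# is exactly zero with high probability (the "spike") and Gaussian otherwise
# (the "slab"), so a prior draw uses only a few of the K candidate variables.
pi_slab = 5.0 / K      # prior inclusion probability (about 5 active variables expected)
slab_sd = 1.0          # standard deviation of the slab component

gamma = rng.random(K) < pi_slab                            # which x_j's enter the model
beta = np.where(gamma, rng.normal(0.0, slab_sd, K), 0.0)   # sparse coefficient vector

# Data simulated from a logistic GLM under this sparse truth.
X = rng.normal(size=(n, K))
prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
y = (rng.random(n) < prob).astype(int)

print(f"{int(gamma.sum())} of {K} covariates are active in this prior draw")

Fitting the posterior over the inclusion indicators (for example, by Markov chain Monte Carlo) is beyond this sketch; the article's results concern how fast such posteriors concentrate around the true density as $n$ grows.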