VARIABLES SELECTION USING THE WALD TEST AND A ROBUST C-P

Citation
S. Sommer and R.M. Huggins, VARIABLES SELECTION USING THE WALD TEST AND A ROBUST C-P, Applied Statistics, 45(1), 1996, pp. 15-29
Number of citations
21
Subject categories
Statistics & Probability
Journal title
Applied Statistics
ISSN journal
00359254
Volume
45
Issue
1
Year of publication
1996
Pages
15 - 29
Database
ISI
SICI code
0035-9254(1996)45:1<15:VSUTWT>2.0.ZU;2-Z
Abstract
A new variables selection criterion is presented. It is based on the Wald test statistic and is defined by Tp = Wp - K + 2p, where K and p are the numbers of parameters in the full model and the submodel respectively, and Wp is the Wald statistic for testing whether the coefficients of the variables not in the submodel are 0. 'Good' submodels will have Tp-values that are close to or smaller than p, and, as with Mallows's Cp, they will be selected by graphical rather than stepwise methods. We first consider an application to the linear regression of the heat evolved in a cement mix on four explanatory variables; we use robust methods and obtain the same results as those from the more computer-intensive methods of Ronchetti and Staudte. Our later applications are to previously published data sets which use logistic regression to predict participation in the US federal food stamp program, myocardial infarction and prostatic cancer. The first data set was shown in previous analysis to contain an outlier and is considered for illustration. In the last two data sets our criterion applied to the maximum likelihood estimates selects the same model as do previously published stepwise analyses. However, for the food stamp data set, the application of our criterion using the robust logistic regression estimates of Carroll and Pederson suggests more parsimonious models than those arising from the likelihood analysis, and further suggests that interactions previously regarded as important may be due to outliers.
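
The definition Tp = Wp - K + 2p can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes ordinary least squares in place of the robust estimators the abstract mentions, and it uses synthetic data, not the cement or food stamp data. The function name wald_tp and the variables X, y and keep are hypothetical names chosen for this illustration. With least squares and the full-model variance estimate, Tp computed this way coincides with Mallows's Cp.

```python
import itertools
import numpy as np


def wald_tp(X, y, keep):
    """Return T_p = W_p - K + 2p for the submodel using columns `keep`.

    Assumption: the full model is fitted by ordinary least squares,
    a stand-in for the robust fits described in the abstract.
    """
    n, K = X.shape
    keep = list(keep)
    p = len(keep)

    # Fit the full model and estimate the coefficient covariance matrix.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = float(resid @ resid) / (n - K)
    cov = s2 * np.linalg.inv(X.T @ X)

    # Wald statistic for H0: coefficients of the omitted columns are 0.
    drop = [j for j in range(K) if j not in keep]
    if drop:
        b = beta[drop]
        V = cov[np.ix_(drop, drop)]
        W = float(b @ np.linalg.solve(V, b))
    else:
        W = 0.0  # nothing omitted, so there is nothing to test

    return W - K + 2 * p


# Synthetic data (an assumption for illustration): only x1 matters.
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=n)

# Enumerate submodels that always contain the intercept (column 0).
results = []
for r in range(0, 4):
    for extra in itertools.combinations([1, 2, 3], r):
        keep = (0,) + extra
        results.append((keep, len(keep), wald_tp(X, y, keep)))

# Print each submodel's p and Tp so that Tp can be compared with p.
for keep, p, tp in results:
    print(keep, p, round(tp, 2))
```

Plotting Tp against p for the enumerated submodels, and looking for points on or below the line Tp = p, would mirror the graphical selection the abstract describes.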