PREDICTION VIA ORTHOGONALIZED MODEL MIXING

Citation
M. Clyde et al., PREDICTION VIA ORTHOGONALIZED MODEL MIXING, Journal of the American Statistical Association, 91(435), 1996, pp. 1197-1208
Number of citations
31
Subject Categories
Statistics & Probability
Volume
91
Issue
435
Year of publication
1996
Pages
1197 - 1208
Database
ISI
SICI code
Abstract
We introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in terms of an orthogonalization of the design matrix. Advantages are both statistical and computational. Statistically, orthogonalization often leads to a reduction in the number of competing models by eliminating correlations. Computationally, large model spaces cannot be enumerated; recent approaches are based on sampling models with high posterior probability via Markov chains. Based on orthogonalization of the space of candidate predictors, we can approximate the posterior probabilities of models by products of predictor-specific terms. This leads to an importance sampling function for sampling directly from the joint distribution over the model space, without resorting to Markov chains. Compared to the latter, orthogonalized model mixing by importance sampling is faster in sampling models and is also more efficient in finding models that contribute significantly to the prediction. Further advantages are in the speed of convergence and the availability of more reliable convergence diagnostic tools. We illustrate these in practice, using a data set on prediction of crime rates. The model space is small enough so that enumeration of all models is available for comparison and convergence checks. Also, we demonstrate the feasibility of orthogonalized model mixing in a large-size problem, which is very difficult to attack by other methods. The data set is from a designed experiment dealing with predicting protein activity under different storage conditions. The model space is large (the rank of the design matrix is 88) and very difficult to explore if expressed in terms of the original variables. We obtain prediction intervals and a probability distribution of the setting that produces the highest response.
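
Illustrative sketch (not from the paper): the idea of sampling models from a product of predictor-specific terms can be shown in a simplified setting. The code below assumes a known error variance `sigma2`, independent N(0, `tau2`) priors on the coefficients of the orthogonalized columns, and a common prior inclusion probability `w`; these names and the use of the SVD for orthogonalization are assumptions made here for illustration. Under these assumptions the posterior over models factors exactly, so models can be drawn directly as independent Bernoulli variables. In the more general setting the abstract describes, the product form is only an approximation and the draws would need importance weights.

```python
import numpy as np

def inclusion_probs(X, y, sigma2=1.0, tau2=10.0, w=0.5):
    """Orthogonalize X and compute per-column posterior inclusion probabilities.

    Assumes (for illustration only) known noise variance sigma2, N(0, tau2)
    priors on coefficients of the orthogonalized columns, and prior
    inclusion probability w for every column.
    """
    # Columns of U are orthonormal and span the column space of X.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    t = U.T @ y  # projection of y on each orthogonal column
    # Log Bayes factor for including column j versus excluding it:
    # t_j ~ N(0, sigma2 + tau2) if included, N(0, sigma2) if excluded.
    log_bf = (0.5 * np.log(sigma2 / (sigma2 + tau2))
              + 0.5 * t**2 * tau2 / (sigma2 * (sigma2 + tau2)))
    logit = np.log(w / (1.0 - w)) + log_bf
    p = 1.0 / (1.0 + np.exp(-logit))
    return U, t, p

def sample_models(p, n_draws, rng):
    """Draw models as independent Bernoulli(p_j) inclusion indicators."""
    return rng.random((n_draws, p.size)) < p

def mixed_fit(U, t, p, sigma2=1.0, tau2=10.0):
    """Model-averaged fitted values: each coefficient is shrunk and
    weighted by its inclusion probability."""
    shrink = tau2 / (sigma2 + tau2)
    return U @ (p * shrink * t)
```

A usage example would build a design matrix with correlated columns, call `inclusion_probs`, and compare the column frequencies of `sample_models` against `p` as a basic convergence check.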