We introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in terms of an orthogonalization of the design matrix.
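To make the orthogonalization step concrete, here is a minimal sketch (illustrative only, not the paper's code) in which the correlated columns of a design matrix are replaced by the orthonormal columns of its thin QR factor; the two column sets span the same space, so subsets of the new coordinates index an orthogonalized model space:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])    # correlated design matrix

# Thin QR decomposition X = Q R: the columns of Q are orthonormal and
# span the same column space as X, so models built from subsets of the
# columns of Q cover the same predictions as models built from X.
Q, R = np.linalg.qr(X)
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: no residual correlation
```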
Advantages are both statistical and computational. Statistically, orthogonalization often leads to a reduction in the number of competing models by eliminating correlations. Computationally, large model spaces cannot be enumerated; recent approaches are based on sampling models with high posterior probability via Markov chains. Based on the orthogonalization of the space of candidate predictors, we can approximate the posterior probabilities of models by products of predictor-specific terms.
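In inclusion-indicator notation (assumed here for illustration, not taken from the abstract), one plausible form of such a product approximation is
\[
  p(M_\gamma \mid y) \;\approx\; \prod_{j=1}^{p} p_j^{\gamma_j}\,(1 - p_j)^{1 - \gamma_j},
\]
where \(\gamma_j \in \{0,1\}\) indicates whether orthogonalized predictor \(j\) enters the model and \(p_j\) is its predictor-specific inclusion probability.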
This leads to an importance sampling function for sampling directly from the joint distribution over the model space, without resorting to Markov chains.
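A minimal sketch of such an importance sampler follows, with made-up inclusion probabilities and a toy stand-in for the exact model posterior (both are illustrative assumptions, not quantities from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Per-coordinate inclusion probabilities for the orthogonalized
# predictors (made up for illustration; the paper derives them from
# predictor-specific terms).
p_incl = np.array([0.95, 0.60, 0.10, 0.05])

def log_q(gamma):
    # Product-form importance density: coordinates enter independently.
    return np.sum(np.where(gamma, np.log(p_incl), np.log1p(-p_incl)))

def log_post(gamma):
    # Toy stand-in for the exact (unnormalized) log model posterior;
    # any posterior that can be evaluated model-by-model plugs in here.
    target = np.array([0.9, 0.7, 0.2, 0.1])
    return np.sum(np.where(gamma, np.log(target), np.log1p(-target)))

# Draw whole models i.i.d. from the joint importance density ...
gammas = rng.random((5000, p_incl.size)) < p_incl
# ... then reweight each sampled model toward the exact posterior.
log_w = np.array([log_post(g) - log_q(g) for g in gammas])
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Weighted estimate of posterior inclusion probabilities.
print((w[:, None] * gammas).sum(axis=0))
```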
Compared to Markov chain methods, orthogonalized model mixing by importance sampling is faster in sampling models and more efficient in finding models that contribute significantly to the prediction. Further advantages are the speed of convergence and the availability of more reliable convergence diagnostics. We illustrate these points in practice using a data set on the prediction of crime rates; the model space is small enough that all models can be enumerated for comparison and convergence checks. We also demonstrate the feasibility of orthogonalized model mixing in a large problem that is very difficult to attack with other methods. The data set is from a designed experiment on predicting protein activity under different storage conditions. The model space is large (the rank of the design matrix is 88) and very difficult to explore in terms of the original variables. We obtain prediction intervals and a probability distribution over the setting that produces the highest response.