Interrelationships of multivariate regression methods using eigenvector basis sets

Authors
Citation
J.H. Kalivas, Interrelationships of multivariate regression methods using eigenvector basis sets, J CHEMOMETR, 13(2), 1999, pp. 111-132
Number of citations
27
Subject categories
Spectroscopy/Instrumentation/Analytical Sciences
Journal title
JOURNAL OF CHEMOMETRICS
Journal ISSN
0886-9383
Volume
13
Issue
2
Year of publication
1999
Pages
111 - 132
Database
ISI
SICI code
0886-9383(199903/04)13:2<111:IOMRMU>2.0.ZU;2-L
Abstract
This paper provides an expository discussion of the interrelationships between least squares (LS), principal component regression (PCR), partial least squares (PLS), ridge regression (RR), generalized ridge regression (GRR), continuum regression (CR) and cyclic subspace regression (CSR) for the linear model y = Xb + e. Developed in this paper is continuum CSR (CCSR). From this study it is ascertained that GRR encompasses LS, PCR, PLS, RR, CR, CSR and CCSR. It is shown that a regression vector, regardless of its source, can be written as a linear combination of the v_i eigenvectors obtained from a singular value decomposition (SVD) of X, i.e. X = UΣV^T. Similarly, it is shown that calibration fitted values ŷ obtained from any linear regression method can be written as a linear combination of the u_i eigenvectors obtained from an SVD of X. Formulae are provided to compute φ and γ, the respective vectors of weights for the v_i and u_i eigenvectors. It is recommended that the φ eigenvector weights be inspected to ascertain exactly what information is being used to form the regression vector for the particular modeling approach used. Analogously, the γ eigenvector weights should be inspected to determine what information is being used to form calibration fitted values. Besides assisting in prediction rank determination, both eigenvector weight plots also allow for easy comparison of models built by different methods, e.g. the PCR model versus the PLS model. It is shown that it is not the number of factors used to build a PCR or PLS model that is important, but the number of eigenvectors used, which ones, and how they are weighted to form the respective regression vectors and fitted values of calibration samples. In essence, how eigenvectors are weighted dictates which GRR model is formed. From the CR, CSR and RR eigenvector weight plots of φ it is concluded that the optimal model will most often have a combination of PCR and PLS attributes. Copyright (C) 1999 John Wiley & Sons, Ltd.
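
The eigenvector-weight idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not reproduce the paper's formulae, so the sketch assumes the standard projections: the weights phi on the v_i are taken as V^T b, and the weights gamma on the u_i as U^T applied to the fitted values Xb. The data are synthetic, and the helper name `eigenvector_weights`, the ridge parameter `lam` and the component count `k` are hypothetical choices made only for illustration.

```python
import numpy as np

# Synthetic calibration data (hypothetical; not taken from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = rng.normal(size=20)

# Thin SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)

def eigenvector_weights(b):
    """Return (phi, gamma): weights on the v_i and on the u_i for a vector b."""
    phi = Vt @ b            # b = V @ phi when b lies in the row space of X
    gamma = U.T @ (X @ b)   # fitted values X @ b = U @ gamma
    return phi, gamma

# Three textbook estimators of b in y = Xb + e.
b_ls = np.linalg.pinv(X) @ y                      # least squares
lam = 0.5                                         # arbitrary ridge parameter
b_rr = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
k = 3                                             # arbitrary number of components
b_pcr = Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])     # PCR on the first k components

phi_ls, gamma_ls = eigenvector_weights(b_ls)
phi_rr, gamma_rr = eigenvector_weights(b_rr)
phi_pcr, gamma_pcr = eigenvector_weights(b_pcr)

# Print the weights side by side to compare how each method uses the v_i.
for name, phi in [("LS", phi_ls), ("RR", phi_rr), ("PCR", phi_pcr)]:
    print(name, np.round(phi, 3))
```

Under these assumptions, PCR keeps the first k LS weights and sets the rest to zero, while RR shrinks each LS weight by s_i^2 / (s_i^2 + lam). The three models therefore differ only in how the same eigenvectors are weighted, which is the comparison the abstract recommends making.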