GENETIC ALGORITHMS AS A METHOD FOR VARIABLE SELECTION IN MULTIPLE LINEAR-REGRESSION AND PARTIAL LEAST-SQUARES REGRESSION, WITH APPLICATIONS TO PYROLYSIS MASS-SPECTROMETRY
D. Broadhurst et al., GENETIC ALGORITHMS AS A METHOD FOR VARIABLE SELECTION IN MULTIPLE LINEAR-REGRESSION AND PARTIAL LEAST-SQUARES REGRESSION, WITH APPLICATIONS TO PYROLYSIS MASS-SPECTROMETRY, Analytica chimica acta, 348(1-3), 1997, pp. 71-86
Four optimising methods for variable selection in multivariate calibration are described. The first determines the optimal subset of variables to give the best possible root-mean-square error of prediction (RMSEP) in a multiple linear regression (MLR) model; the second determines the optimal subset of variables that produces a model with an RMSEP less than or equal to a given value. Methods three and four are identical to methods one and two, respectively, except that they use a cost function derived from a partial least squares (PLS) model rather than an MLR model. Applied to a typical set of pyrolysis mass spectrometry data, the first variable selection method is shown to reduce the RMSEP of the optimal MLR or PLS model significantly when the number of variables is decreased by approximately half. Alternatively, the number of variables may be reduced substantially (> 10-fold) with no loss in RMSEP.
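
The first method (choosing a variable subset that minimises the RMSEP of an MLR model) can be illustrated with a minimal sketch. The abstract does not give the genetic algorithm's operators or settings, so everything below is an assumption made for illustration: the binary-mask encoding, tournament selection, uniform crossover, bit-flip mutation, elitism, the hold-out split used as the cost function, and the names `rmsep` and `ga_select`. It is not the authors' implementation.

```python
import numpy as np


def rmsep(X_train, y_train, X_test, y_test, mask):
    """RMSEP on the test set of a least-squares MLR model built on masked columns."""
    if not mask.any():
        return np.inf  # an empty subset cannot predict anything
    # Add an intercept column to the selected variables.
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, mask]])
    A_test = np.column_stack([np.ones(len(X_test)), X_test[:, mask]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    resid = y_test - A_test @ coef
    return float(np.sqrt(np.mean(resid ** 2)))


def ga_select(X_train, y_train, X_test, y_test, pop_size=30,
              generations=40, mutation_rate=0.05, seed=0):
    """Search for a boolean variable mask with low RMSEP (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    n_vars = X_train.shape[1]

    def cost(mask):
        return rmsep(X_train, y_train, X_test, y_test, mask)

    # Random initial population; individual 0 is the full model, so with
    # elitism the result is never worse than using every variable.
    pop = rng.random((pop_size, n_vars)) < 0.5
    pop[0] = True
    costs = np.array([cost(m) for m in pop])

    for _generation in range(generations):
        children = [pop[np.argmin(costs)].copy()]  # elitism: keep the best mask
        while len(children) < pop_size:
            parents = []
            for _pick in range(2):
                # Binary tournament: the lower-cost of two random individuals wins.
                i, j = rng.integers(0, pop_size, size=2)
                parents.append(pop[i] if costs[i] <= costs[j] else pop[j])
            # Uniform crossover, then bit-flip mutation.
            take_first = rng.random(n_vars) < 0.5
            child = np.where(take_first, parents[0], parents[1])
            child ^= rng.random(n_vars) < mutation_rate
            children.append(child)
        pop = np.array(children)
        costs = np.array([cost(m) for m in pop])

    best = int(np.argmin(costs))
    return pop[best], float(costs[best])
```

Calling `ga_select(X_train, y_train, X_test, y_test)` returns the selected boolean mask and its RMSEP. Because this sketch scores candidates on the same hold-out set it reports, the resulting RMSEP is optimistic; a separate validation set would be needed for an unbiased estimate. A PLS-based variant would differ only in the model fitted inside `rmsep`.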