GENETIC ALGORITHMS AS A METHOD FOR VARIABLE SELECTION IN MULTIPLE LINEAR-REGRESSION AND PARTIAL LEAST-SQUARES REGRESSION, WITH APPLICATIONS TO PYROLYSIS MASS-SPECTROMETRY

Citation
D. Broadhurst et al., GENETIC ALGORITHMS AS A METHOD FOR VARIABLE SELECTION IN MULTIPLE LINEAR-REGRESSION AND PARTIAL LEAST-SQUARES REGRESSION, WITH APPLICATIONS TO PYROLYSIS MASS-SPECTROMETRY, Analytica chimica acta, 348(1-3), 1997, pp. 71-86
Number of citations
39
Subject Categories
Chemistry Analytical
Journal title
Analytica chimica acta
ISSN journal
00032670
Volume
348
Issue
1-3
Year of publication
1997
Pages
71 - 86
Database
ISI
SICI code
0003-2670(1997)348:1-3<71:GAAAMF>2.0.ZU;2-Y
Abstract
Four optimising methods for variable selection in multivariate calibration have been described: one for determining the optimal subset of variables to give the best possible root-mean-square error of prediction (RMSEP) in a multiple linear regression (MLR) model, the second for determining the optimal subset of variables which produce a model with RMSEP less than or equal to a given value. Algorithms three and four were identical to algorithms one and two, respectively, except that this time they use a cost function derived from a partial least squares (PLS) model rather than an MLR model. Applied to a typical set of pyrolysis mass spectrometry data the first variable selection method is shown to reduce the RMSEP of the optimal MLR or PLS model significantly when the number of variables is decreased by approximately half. Alternatively, the number of variables may be reduced substantially (> 10-fold) with no loss in RMSEP.
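
The first method in the abstract (searching for the variable subset that minimises the RMSEP of an MLR model) can be illustrated with a minimal sketch. The abstract does not give the genetic operators or parameter settings, so everything below is an assumption made for illustration: the function names `rmsep` and `ga_select`, the truncation selection, uniform crossover, bit-flip mutation and single-elite scheme, the population size and mutation rate, the use of a held-out validation set to compute RMSEP, and the synthetic data. It is not the authors' implementation.

```python
import numpy as np

def rmsep(X_train, y_train, X_val, y_val, mask):
    """RMSEP on the validation set of an MLR model using the masked variables."""
    if not mask.any():
        return np.inf  # an empty subset cannot form a model
    # Add an intercept column and fit by ordinary least squares.
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, mask]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, mask]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    residuals = y_val - A_val @ coef
    return float(np.sqrt(np.mean(residuals ** 2)))

def ga_select(X_train, y_train, X_val, y_val, pop_size=30, generations=40,
              mutation_rate=0.05, seed=0):
    """Illustrative GA (assumed operators): returns (best_mask, best_rmsep)."""
    rng = np.random.default_rng(seed)
    n_vars = X_train.shape[1]
    # Each chromosome is a boolean mask: True means the variable is included.
    population = rng.random((pop_size, n_vars)) < 0.5
    best_mask, best_cost = None, np.inf
    for _ in range(generations):
        costs = np.array([rmsep(X_train, y_train, X_val, y_val, m)
                          for m in population])
        order = np.argsort(costs)
        population, costs = population[order], costs[order]
        if costs[0] < best_cost:
            best_mask, best_cost = population[0].copy(), float(costs[0])
        # Truncation selection: the better half of the population breeds.
        parents = population[: pop_size // 2]
        children = []
        while len(children) < pop_size - 1:
            a, b = parents[rng.integers(len(parents), size=2)]
            take_a = rng.random(n_vars) < 0.5          # uniform crossover
            child = np.where(take_a, a, b)
            flip = rng.random(n_vars) < mutation_rate  # bit-flip mutation
            children.append(child ^ flip)
        # Elitism: carry the current best chromosome over unchanged.
        population = np.vstack([population[:1], np.array(children)])
    return best_mask, best_cost

# Synthetic demonstration data (hypothetical): 20 variables, 3 informative.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 20))
y = 2.0 * X[:, 3] - X[:, 7] + 0.5 * X[:, 12] + 0.1 * rng.normal(size=80)
mask, err = ga_select(X[:50], y[:50], X[50:], y[50:])
print("selected variables:", np.flatnonzero(mask))
print("validation RMSEP:", err)
```

Under the same assumptions, the PLS variants described in the abstract would differ only in the cost function: `rmsep` would fit a PLS model on the masked variables instead of solving the least-squares problem directly. The second and fourth methods, which look for a subset meeting a given RMSEP target, would need a different fitness definition and are not shown here.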