D. Jouanrimbaud et al., RANDOM CORRELATION IN VARIABLE SELECTION FOR MULTIVARIATE CALIBRATION WITH A GENETIC ALGORITHM, Chemometrics and intelligent laboratory systems, 35(2), 1996, pp. 213-220
The importance of the validation step in multiple linear regression of
near-infrared spectroscopic data, after selection of wavelengths by a
genetic algorithm, is investigated with the use of random variables.
It is shown that in spite of a careful validation procedure, the GA
can still select irrelevant variables. The effect is greatly reduced by
applying a forward selection in the subsets selected by the genetic
algorithm.
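
The idea of running a forward selection inside a GA-selected subset can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic, the "GA subset" is a hand-written hypothetical list mixing informative and purely random columns, and the stopping rule (stop when the validation RMSE no longer improves) is an assumption made for illustration. The helper names `rmse` and `forward_select` are likewise hypothetical.

```python
import numpy as np

def rmse(X_tr, y_tr, X_va, y_va, cols):
    """Fit least squares (with intercept) on `cols`; return validation RMSE."""
    A_tr = np.column_stack([np.ones(len(y_tr)), X_tr[:, cols]])
    A_va = np.column_stack([np.ones(len(y_va)), X_va[:, cols]])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return float(np.sqrt(np.mean((A_va @ coef - y_va) ** 2)))

def forward_select(X_tr, y_tr, X_va, y_va, candidates):
    """Greedily add the candidate that most lowers validation RMSE.

    Stops as soon as no remaining candidate improves the score
    (assumed stopping rule, chosen for simplicity).
    """
    selected = []
    # Baseline: intercept-only model predicting the training mean.
    best = float(np.sqrt(np.mean((y_va - y_tr.mean()) ** 2)))
    remaining = list(candidates)
    while remaining:
        score, col = min(
            (rmse(X_tr, y_tr, X_va, y_va, selected + [c]), c) for c in remaining
        )
        if score >= best:
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best

# Synthetic data: columns 0-4 stand in for real variables (only 0-2 carry
# signal); columns 5-24 are pure random variables added as probes.
rng = np.random.default_rng(0)
n = 120
X_real = rng.normal(size=(n, 5))
y = X_real @ np.array([2.0, -1.5, 1.0, 0.0, 0.0]) + 0.1 * rng.normal(size=n)
X = np.hstack([X_real, rng.normal(size=(n, 20))])

# Hypothetical GA output: three informative and three random columns.
ga_subset = [0, 1, 2, 7, 12, 19]

X_tr, y_tr, X_va, y_va = X[:80], y[:80], X[80:], y[80:]
selected, score = forward_select(X_tr, y_tr, X_va, y_va, ga_subset)
print("kept:", selected, "validation RMSE:", round(score, 3))
```

Any random column that survives in `selected` is a chance correlation with the validation set, which is the effect the abstract warns about; counting such survivors over repeated runs gives a rough measure of how much the forward selection step helps.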