Using principal component regression (PCR) as a multivariate calibration
tool always raises the question of which subset of factors, i.e. principal
components (PCs), gives the best calibration model. Normally, factor selection
is based on deterministic methods such as top-down procedures,
forward-backward stepwise variable selection, or correlated principal
component regression
(CPCR). In contrast, we applied a stochastic method, namely a genetic
algorithm (GA), for factor selection in this paper. A new kind of fitness
function was applied that combines the prediction errors of the calibration
and an independent validation set. The performance of eigenvalue and
correlation ranking was compared. A general statistical criterion for judging the
significance of differences between individual calibration models is
introduced. In this context, it could be shown that for the uncertainties of the
standard deviations representing the prediction errors, a very simple
approximation formula holds that involves only the number of standards. For the
current applications, it is shown that the GA gives results very close to
the CPCR solutions. (C) 2000 Elsevier Science B.V. All rights reserved.
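
The idea of GA-based factor selection with a combined fitness can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the calibration and validation errors are combined, so the fitness below simply adds the two RMSE values as an assumption. The GA operators (truncation selection, one-point crossover, bit-flip mutation), the function names `pca_scores`, `fitness` and `ga_select`, all parameter values, and the synthetic data are likewise hypothetical choices made only for illustration.

```python
import numpy as np

def pca_scores(X_cal, X_val, max_factors):
    """Center on the calibration mean and project both sets onto the leading PCs."""
    mean = X_cal.mean(axis=0)
    _, _, vt = np.linalg.svd(X_cal - mean, full_matrices=False)
    loadings = vt[:max_factors].T
    return (X_cal - mean) @ loadings, (X_val - mean) @ loadings

def fitness(mask, T_cal, y_cal, T_val, y_val):
    """Assumed combined fitness: calibration RMSE + validation RMSE (lower is better)."""
    if not mask.any():
        return np.inf  # an empty factor subset is not a usable model
    A_cal = np.column_stack([np.ones(len(y_cal)), T_cal[:, mask]])
    A_val = np.column_stack([np.ones(len(y_val)), T_val[:, mask]])
    coef, *_ = np.linalg.lstsq(A_cal, y_cal, rcond=None)
    rmse_cal = np.sqrt(np.mean((A_cal @ coef - y_cal) ** 2))
    rmse_val = np.sqrt(np.mean((A_val @ coef - y_val) ** 2))
    return rmse_cal + rmse_val

def ga_select(T_cal, y_cal, T_val, y_val,
              pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Search over boolean masks that mark which PCs enter the regression."""
    rng = np.random.default_rng(seed)
    n = T_cal.shape[1]
    pop = rng.random((pop_size, n)) < 0.5  # random initial population

    def score(m):
        return fitness(m, T_cal, y_cal, T_val, y_val)

    for _ in range(generations):
        scores = np.array([score(m) for m in pop])
        elite = pop[np.argsort(scores)][: pop_size // 2]  # keep the better half
        children = []
        while len(children) < pop_size - len(elite):
            i, j = rng.integers(0, len(elite), size=2)
            cut = rng.integers(1, n)                       # one-point crossover
            child = np.concatenate([elite[i][:cut], elite[j][cut:]])
            child ^= rng.random(n) < p_mut                 # bit-flip mutation
            children.append(child)
        pop = np.vstack([elite, np.array(children)])

    scores = np.array([score(m) for m in pop])
    return pop[np.argmin(scores)], scores.min()

# Synthetic data, purely illustrative: 40 calibration and 20 validation samples.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 12))
y = X @ rng.normal(size=12) + 0.1 * rng.normal(size=60)
T_cal, T_val = pca_scores(X[:40], X[40:], max_factors=8)
best_mask, best_score = ga_select(T_cal, y[:40], T_val, y[40:])
print("selected PCs:", np.flatnonzero(best_mask) + 1, "fitness:", round(best_score, 4))
```

Because the better half of each generation is carried over unchanged, the best fitness in this sketch can never get worse from one generation to the next. A deterministic alternative such as eigenvalue or correlation ranking would correspond to evaluating only a handful of fixed masks with the same `fitness` function.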