To improve the calibration speed for very large data sets, novel
algorithms for principal component regression (PCR) and partial least
squares (PLS) regression are presented. They use the Lanczos or PLS-1
transformation to reduce the data matrix X to a small bidiagonal matrix
(R), after which the small tridiagonal matrix (R'R) is diagonalized and
inverted. The complexity of the PCR model may be optimized by
cross-validation (PCRL) or by simpler and faster recipes based on
round-off monitoring and model fit (PCRF). A similar fast PLS procedure
(PLSF) is also presented. Calculations are made for five near-infrared
(NIR) spectroscopy data sets and compared with PCR with feature selection
(PCRS) based on correlation and with
de Jong's simple partial least squares (SIMPLS). The Lanczos-based methods
have comparable prediction performance and similar model complexity to
PCRS and SIMPLS but are considerably faster. From a detailed comparison of the
methods, some insight is gained into the performance of the PLS method.
(C) 2000 Elsevier Science B.V. All rights reserved.