Multivariate strategies for classification based on NIR-spectra - with application to mayonnaise

Citation
U.G. Indahl et al., Multivariate strategies for classification based on NIR-spectra - with application to mayonnaise, CHEM INTELL, 49(1), 1999, pp. 19-31
Number of citations
35
Subject categories
Spectroscopy/Instrumentation/Analytical Sciences
Journal title
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
ISSN journal
0169-7439
Volume
49
Issue
1
Year of publication
1999
Pages
19 - 31
Database
ISI
SICI code
0169-7439(19990906)49:1<19:MSFCBO>2.0.ZU;2-7
Abstract
The goal of the presented study is two-fold. First, we want to emphasize the power of Near Infrared Reflectance (NIR) spectroscopy for discrimination between mayonnaise samples containing different vegetable oils. Secondly, we want to use our data to compare the performances of different classification procedures. The NIR spectra with 351 variables correspond to equally spaced wavelengths in the 1100-2500 nm area. Feature extraction both by automatic wavelength-selection and by projection onto principal components (PCs) is discussed. The discriminant methods considered are linear discriminant analysis (LDA), quadratic discriminant analysis (QDA) and regression with categorical {0,1}-responses. A dataset containing 162 spectra of mayonnaise samples based on six different vegetable oils is analyzed. By LDA with authentic cross-validation (PC-models re-estimated for each cross-validation segment), only one sample was misclassified. Classification by allocating a sample according to the largest fitted value of a linear regression (Discriminant-Partial least squares (DPLS) or Discriminant-Principal components regression (DPCR)) is demonstrated to be sub-optimal compared to LDA of the corresponding PLS- or PCR-scores. QDA significantly outperforms LDA for projections of the data onto subspaces of moderate size (scores of 7-9 PCs). Two automatic variable-selection procedures choose 16 and 26 wavelengths (variables), respectively, from the spectra. Based on the selected wavelengths, LDA gives considerably better classification than the regression approach.
By reporting the performances of several feature extraction techniques in tandem with three of the most common classification methods, we hope that the reader will notice two relevant aspects: (1) By using the DPLS and DPCR (classification by 'dummy' regressions) one is exposed to a significant risk of obtaining sub-optimal classification results; (2) The automatic wavelength selections may give valuable information about what is actually causing a successful discrimination. Such knowledge can, for instance, be used to select the most suited filters for online applications of NIR. Besides demonstrating different classification strategies, our study clearly shows that classification methods with NIR spectra can be used to discriminate between mayonnaise samples of different oil types and fatty acid composition. (C) 1999 Elsevier Science B.V. All rights reserved.
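
The two allocation rules compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the mayonnaise data. The data are synthetic, shaped like the study (162 samples, 351 variables, six classes). The function names, the six cross-validation segments and the choice of five principal components are assumptions made purely for illustration. The sketch re-estimates the PC model inside every cross-validation segment, then allocates the held-out samples once by LDA of the scores and once by the largest fitted value of a {0,1} indicator regression on the same scores (a DPCR-style rule).

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the column mean and the first principal-component loadings."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centred data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def lda_fit(T, y, n_classes):
    """Estimate class means, inverse pooled within-class covariance and priors."""
    means = np.array([T[y == k].mean(axis=0) for k in range(n_classes)])
    resid = T - means[y]
    pooled = resid.T @ resid / (len(T) - n_classes)
    priors = np.bincount(y, minlength=n_classes) / len(y)
    return means, np.linalg.inv(pooled), priors

def lda_predict(T, means, prec, priors):
    """Allocate each row to the class with the largest linear discriminant score."""
    quad = 0.5 * np.sum(means @ prec * means, axis=1)
    scores = T @ prec @ means.T - quad + np.log(priors)
    return scores.argmax(axis=1)

def dummy_regression_predict(T_train, y_train, T_test, n_classes):
    """Regress {0,1} indicator columns on the scores; allocate by largest fitted value."""
    Y = np.eye(n_classes)[y_train]
    A_train = np.column_stack([np.ones(len(T_train)), T_train])
    A_test = np.column_stack([np.ones(len(T_test)), T_test])
    B, *_ = np.linalg.lstsq(A_train, Y, rcond=None)
    return (A_test @ B).argmax(axis=1)

def cross_validate(X, y, n_components, n_segments=6, seed=0):
    """Count misclassifications; the PC model is re-estimated in every segment."""
    n_classes = int(y.max()) + 1
    order = np.random.default_rng(seed).permutation(len(y))
    errors = {"lda": 0, "dummy_regression": 0}
    for test_idx in np.array_split(order, n_segments):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Fit PCA on the training segment only, then project both parts.
        mean, P = pca_fit(X[train], n_components)
        T_train, T_test = (X[train] - mean) @ P, (X[test_idx] - mean) @ P
        model = lda_fit(T_train, y[train], n_classes)
        errors["lda"] += int(np.sum(lda_predict(T_test, *model) != y[test_idx]))
        pred = dummy_regression_predict(T_train, y[train], T_test, n_classes)
        errors["dummy_regression"] += int(np.sum(pred != y[test_idx]))
    return errors

def make_synthetic_spectra(n_per_class=27, n_classes=6, n_vars=351, seed=1):
    """Hypothetical stand-in data: class structure in a 3-dimensional latent space."""
    rng = np.random.default_rng(seed)
    basis = rng.normal(size=(3, n_vars))
    centres = rng.normal(scale=3.0, size=(n_classes, 3))
    y = np.repeat(np.arange(n_classes), n_per_class)
    latent = centres[y] + rng.normal(size=(len(y), 3))
    X = latent @ basis + 0.1 * rng.normal(size=(len(y), n_vars))
    return X, y

X, y = make_synthetic_spectra()
errors = cross_validate(X, y, n_components=5)
print(errors)  # misclassification counts per allocation rule
```

The counts printed for synthetic data say nothing about the mayonnaise results. The sketch only shows where the two rules differ: both use the same scores, but LDA accounts for the pooled within-class covariance while the indicator regression does not.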