The dangers of creating false classifications due to noise in electronic nose and similar multivariate analyses

Citation
Kl. Goodner et al., The dangers of creating false classifications due to noise in electronic nose and similar multivariate analyses, SENS ACTU-B, 80(3), 2001, pp. 261-266
Number of citations
10
Subject categories
Spectroscopy/Instrumentation/Analytical Sciences; Instrumentation & Measurement
Journal title
SENSORS AND ACTUATORS B-CHEMICAL
Journal ISSN
0925-4005
Volume
80
Issue
3
Year of publication
2001
Pages
261 - 266
Database
ISI
SICI code
0925-4005(200112)80:3<261:TDOCFC>2.0.ZU;2-O
Abstract
Randomly generated data with error limits of 1-10%, along with experimental data, were employed to demonstrate the dangers of over-fitting data, which creates artificial differentiation. Analysis of variance (ANOVA), principal components analysis (PCA), and discriminant function analysis (DFA) were employed for the data analysis. In cases where the ratio of samples to variables (features) falls below six, single-class systems containing only random noise and random groupings can be misclassified into more than a single group when the discriminant techniques are employed. The smaller the group size, the more erroneous classifications are made. Larger sample sizes minimize the random noise and allow the true differences to show. A minimum number of variables (features) should be employed when developing classification models to avoid over-fitting data. The ratio of data points to variables should be at least six to avoid over-fitting classification errors, and the model should be validated using data points not used in generating it. (C) 2001 Elsevier Science B.V. All rights reserved.
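The effect described in the abstract can be illustrated with a small simulation. The following is a minimal sketch and not the authors' procedure: it assumes Gaussian noise at a 5% relative error around a single true value, three randomly assigned groups, and a simple linear discriminant with a pooled covariance and equal priors, standing in for DFA. The function names (fit_lda, predict_lda, noise_accuracy) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def fit_lda(X, y):
    """Fit a linear discriminant: class means plus pooled covariance."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    centered = X - means[np.searchsorted(classes, y)]
    pooled_cov = centered.T @ centered / (len(X) - len(classes))
    return classes, means, np.linalg.pinv(pooled_cov)


def predict_lda(model, X):
    """Assign each row to the class with the nearest mean (Mahalanobis)."""
    classes, means, cov_inv = model
    diff = X[:, None, :] - means[None, :, :]
    dist2 = np.einsum("nkp,pq,nkq->nk", diff, cov_inv, diff)
    return classes[np.argmin(dist2, axis=1)]


def noise_accuracy(n_samples, n_features, n_groups=3, rel_error=0.05,
                   n_trials=50, seed=0):
    """Mean training and held-out accuracy on pure noise with random groups."""
    rng = np.random.default_rng(seed)
    train_acc, test_acc = [], []
    for _ in range(n_trials):
        # One true class: every reading is 1.0 plus rel_error relative noise.
        X_train = 1.0 + rel_error * rng.standard_normal((n_samples, n_features))
        X_test = 1.0 + rel_error * rng.standard_normal((n_samples, n_features))
        # Random, balanced group labels that carry no real information.
        y_train = rng.permutation(np.arange(n_samples) % n_groups)
        y_test = rng.permutation(np.arange(n_samples) % n_groups)
        model = fit_lda(X_train, y_train)
        # Resubstitution accuracy: the model is scored on its own training data.
        train_acc.append(np.mean(predict_lda(model, X_train) == y_train))
        # Validation accuracy: scored on points not used to build the model.
        test_acc.append(np.mean(predict_lda(model, X_test) == y_test))
    return float(np.mean(train_acc)), float(np.mean(test_acc))


if __name__ == "__main__":
    # Sample-to-feature ratios of 2 (below six) and 60 (well above six).
    for n, p in [(24, 12), (240, 4)]:
        train, test = noise_accuracy(n, p)
        print(f"samples/features = {n / p:.0f}: "
              f"train {train:.2f}, held-out {test:.2f}")
```

Under these assumptions, the low-ratio case should show a training accuracy well above the one-in-three chance level even though the data contain no real groups, while accuracy on held-out points stays near chance. This is the kind of artificial differentiation the abstract warns about, and the reason it recommends validating with data not used to generate the model.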