Feature selection by genetic algorithms for mass spectral classifiers

Citation
H. Yoshida et al., Feature selection by genetic algorithms for mass spectral classifiers, ANALYT CHIM, 446(1-2), 2001, pp. 485-494
Number of citations
33
Subject categories
Spectroscopy /Instrumentation/Analytical Sciences
Journal title
ANALYTICA CHIMICA ACTA
ISSN journal
0003-2670
Volume
446
Issue
1-2
Year of publication
2001
Pages
485 - 494
Database
ISI
SICI code
0003-2670(20011119)446:1-2<485:FSBGAF>2.0.ZU;2-5
Abstract
Mass spectral classifiers for 15 substructures have been computed that give discrete present/absent answers. For the development of classifiers, linear discriminant analysis (LDA) and discriminant partial least squares (DPLS) have been used. The low resolution mass spectra were transformed into a set of 400 spectral features. Because each spectrum is described with so many features, some features may not be necessary, and others may contribute only noise. Therefore, the effect of feature selection has been investigated. The methods used were selection by Fisher ratios and selection by a genetic algorithm (GA). The first method is univariate, the second is multivariate; advantages and disadvantages of both are discussed. On average, feature selection did not significantly change the classification performance compared with results that have been obtained with all features. However, it was possible to reduce the number of features considerably without a loss of classification performance. For a few substructures, GA together with LDA resulted in much better classifiers than DPLS with all features. The features selected for classification of a benzyl substructure and for the presence of chlorine have been interpreted in terms of mass spectrometric fragmentation rules. © 2001 Elsevier Science B.V. All rights reserved.
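
Illustrative note (not part of the original record): the univariate selection by Fisher ratios mentioned in the abstract can be sketched as follows. The abstract does not give the formula used, so this sketch assumes one common two-class definition: the squared difference of the class means divided by the sum of the class variances, computed separately for each feature. The function names `fisher_ratios` and `select_top_features` and the toy data are hypothetical.

```python
from statistics import mean, variance


def fisher_ratios(spectra, labels):
    """Return one Fisher ratio per feature for a present/absent problem.

    Assumed definition (not taken from the paper):
        (mean_present - mean_absent)**2 / (var_present + var_absent)
    Each class needs at least two spectra so a sample variance exists.
    """
    present = [s for s, lab in zip(spectra, labels) if lab]
    absent = [s for s, lab in zip(spectra, labels) if not lab]
    ratios = []
    for j in range(len(spectra[0])):
        p = [s[j] for s in present]
        a = [s[j] for s in absent]
        gap = (mean(p) - mean(a)) ** 2
        spread = variance(p) + variance(a)
        # A feature that is constant within both classes gets ratio 0 here
        # to avoid division by zero; this is a simplification.
        ratios.append(gap / spread if spread > 0 else 0.0)
    return ratios


def select_top_features(spectra, labels, k):
    """Return the indices of the k features with the largest Fisher ratios."""
    ratios = fisher_ratios(spectra, labels)
    ranked = sorted(range(len(ratios)), key=lambda j: ratios[j], reverse=True)
    return ranked[:k]


# Toy data: 4 spectra with 3 features each; only feature 1 separates the classes well.
toy_spectra = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [0.9, 1.0, 0.5],
    [1.1, 1.2, 0.4],
]
toy_labels = [True, True, False, False]  # substructure present / absent
top_features = select_top_features(toy_spectra, toy_labels, 2)
```

Because each feature is scored on its own, a ranking of this kind cannot account for interactions between features, which is the limitation a multivariate search such as a GA is meant to address.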