Mass spectral classifiers that give discrete present/absent answers have
been computed for 15 substructures. For the development of the classifiers,
linear discriminant analysis (LDA) and discriminant partial least squares
(DPLS) have been used. The low resolution mass spectra were transformed into
a set of 400 spectral features. Because each spectrum is described by so
many features, some features may be unnecessary and others may contribute
only noise. Therefore, the effect of feature selection has been
investigated. The methods used were selection by Fisher ratios and
selection by a genetic algorithm (GA). The first method is univariate, the
second is multivariate; advantages and disadvantages of both are discussed.
On average,
feature selection did not significantly change the classification
performance compared with the results obtained with all features. However,
it was possible to reduce the number of features considerably without a
loss of classification performance. For a few substructures, GA together
with LDA resulted in much better classifiers than DPLS with all features.
The
features selected for the classification of a benzyl substructure and for
the presence of chlorine have been interpreted in terms of mass
spectrometric fragmentation rules. (C) 2001 Elsevier Science B.V. All
rights reserved.
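
The univariate selection by Fisher ratios mentioned above can be illustrated with a minimal sketch. It assumes the common two-class definition, (mean_present - mean_absent)^2 / (var_present + var_absent), which may differ in detail from the formula used in the paper. The function names, the synthetic data, and the choice of feature 7 as the informative one are all hypothetical and serve only as illustration.

```python
import random
from statistics import mean, variance


def fisher_ratio(present, absent):
    """Two-class Fisher ratio for one feature (assumed textbook definition)."""
    within = variance(present) + variance(absent)
    if within == 0:
        # Choice made for this sketch: constant features are ranked last.
        return 0.0
    return (mean(present) - mean(absent)) ** 2 / within


def rank_features(X, y):
    """Return feature indices sorted by decreasing Fisher ratio, plus the ratios.

    X is a list of feature vectors, y a list of 0/1 labels
    (1 = substructure present, 0 = absent).
    """
    n_features = len(X[0])
    ratios = []
    for j in range(n_features):
        present = [row[j] for row, label in zip(X, y) if label == 1]
        absent = [row[j] for row, label in zip(X, y) if label == 0]
        ratios.append(fisher_ratio(present, absent))
    order = sorted(range(n_features), key=lambda j: ratios[j], reverse=True)
    return order, ratios


# Synthetic stand-in for 200 spectra described by 400 features (not real data).
random.seed(0)
n_spectra, n_features = 200, 400
y = [i % 2 for i in range(n_spectra)]
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_spectra)]
for i in range(n_spectra):
    X[i][7] += 3.0 * y[i]  # make feature 7 informative for the class label

order, ratios = rank_features(X, y)
top10 = order[:10]  # indices of the ten highest-ranked features
```

Because each feature is scored on its own, this ranking cannot detect features that are only useful in combination, which is the motivation for a multivariate search such as a genetic algorithm.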