The development of a genetic algorithm (GA) for pattern recognition analysis
of pyrolysis gas chromatographic data is reported. The GA selects features
that optimize the separation of the classes in a plot of the two largest
principal components (PCs) of the data. Because the largest PCs capture the
bulk of the variance in the data, the peaks chosen by the GA convey
information primarily about differences between the classes in the data set.
Hence, the principal component analysis routine embedded in the fitness
function of the GA acts as an information filter, significantly reducing the
size of the search space, since it restricts the search to feature sets whose
PC plots show clustering on the basis of class. In addition, the algorithm
can focus on those classes and/or samples that are difficult to classify as
it trains, using a form of boosting. Samples that consistently classify
correctly are not as heavily weighted in the analysis as samples that are
difficult to classify. Over time, the algorithm learns its optimal parameters
in a manner similar to a neural network. The proposed algorithm integrates
aspects of artificial intelligence and evolutionary computation to yield a
'smart' one-pass procedure for pattern recognition. The efficacy and
efficiency of the pattern recognition GA are demonstrated using a data set
consisting of 133 pyrochromatograms of cultured skin fibroblasts obtained
from 24 obligate cystic fibrosis homozygotes and from 22 normal controls.
(C) 1999 Elsevier Science B.V. All rights reserved.
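
The fitness evaluation and sample reweighting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how class separation in the PC plot is scored or how the weights are updated, so the nearest-neighbour scoring rule, the additive weight update, the parameters `k` and `rate`, and all function names below are assumptions made for illustration only.

```python
import numpy as np


def pc_scores(X, mask, n_components=2):
    """Project the selected features onto the largest principal components."""
    Xs = X[:, mask].astype(float)
    Xs = Xs - Xs.mean(axis=0)
    std = Xs.std(axis=0)
    Xs = Xs / np.where(std > 0, std, 1.0)  # autoscale; guard constant columns
    # PCA via SVD: the scores are the left singular vectors scaled by s.
    U, s, _ = np.linalg.svd(Xs, full_matrices=False)
    return U[:, :n_components] * s[:n_components]


def sample_hits(scores, y, k=3):
    """Fraction of each sample's k nearest neighbours in the PC plot that
    share its class (assumed scoring rule, not taken from the abstract)."""
    d = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    nearest = np.argsort(d, axis=1)[:, :k]
    return (y[nearest] == y[:, None]).mean(axis=1)


def fitness(X, y, mask, weights, k=3):
    """Weighted class separation in the plot of the two largest PCs."""
    if mask.sum() < 2:  # need at least two features for a 2-D PC plot
        return 0.0
    hits = sample_hits(pc_scores(X, mask), y, k)
    return float(np.sum(weights * hits))


def boost_weights(weights, mean_hits, rate=0.5):
    """Shift weight toward samples that are hard to classify
    (hypothetical update rule; `rate` is an assumed parameter)."""
    new = weights + rate * (1.0 - mean_hits)
    return new / new.sum()
```

In a GA loop built on this sketch, each chromosome would be a boolean `mask` over the chromatographic peaks, `fitness` would rank the chromosomes in a generation, and `boost_weights` would be called between generations with the per-sample hit rates averaged over the population.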