Variable selection in classification of environmental soil samples for partial least square and neural network models

Citation
Z. Ramadan et al., Variable selection in classification of environmental soil samples for partial least square and neural network models, ANALYT CHIM, 446(1-2), 2001, pp. 233-244
Number of citations
46
Subject categories
Spectroscopy/Instrumentation/Analytical Sciences
Journal title
ANALYTICA CHIMICA ACTA
ISSN journal
0003-2670
Volume
446
Issue
1-2
Year of publication
2001
Pages
233 - 244
Database
ISI
SICI code
0003-2670(20011119)446:1-2<233:VSICOE>2.0.ZU;2-2
Abstract
Two variable selection methods were evaluated by comparing their predictions with respect to differentiating among environmental soil samples. The focus of this work is to determine which input variables are most relevant for prediction of soil sources using discriminant partial least squares (D-PLS) and back-propagation artificial neural network (BP-ANN) models. The methods investigated were stepwise variable selection and genetic algorithms (GAs). Microbial community DNA was extracted from 48 environmental soil samples derived from different field crops and soil sources. After amplification of bacterial ribosomal RNA genes by polymerase chain reaction (PCR), the products were separated by gel electrophoresis. Characteristic complex band patterns were obtained, indicating high bacterial diversity. Two hundred and twenty-three DNA band patterns produced in the gels of the soil samples were used in the analysis, after removal of the included DNA standard markers. Based on the brightness of the bands, densitometric curves of the selected DNA band patterns were extracted from the gel images. The curves were smoothed using the Savitzky-Golay method and scaled to the DNA standard markers. The prediction results based on the two variable selection methods for the PLS and ANN models are presented and compared. Both models gave good results before any variable selection, with the ANN performing better than D-PLS. The prediction performance of both models, especially D-PLS, was improved by applying the stepwise and the GA variable selection methods. The study also shows that GA variable selection improved predictive ability significantly more than stepwise variable selection. (C) 2001 Elsevier Science B.V. All rights reserved.
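
The preprocessing step mentioned in the abstract (Savitzky-Golay smoothing of a densitometric curve, followed by scaling to a marker) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no window size, polynomial order, or scaling rule, so the 5-point quadratic filter, the untouched end points, and the division by a marker peak intensity are all assumptions, and the function names `savgol5` and `scale_to_marker` are hypothetical.

```python
# Standard 5-point quadratic Savitzky-Golay convolution weights (they sum to 1).
# The window size and polynomial order are assumptions, not taken from the paper.
SG5_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(curve):
    """Smooth a 1-D sequence with a 5-point quadratic Savitzky-Golay filter.

    The two points at each end are returned unsmoothed (a simplifying
    assumption; other edge treatments are possible).
    """
    values = [float(v) for v in curve]
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2 : i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window))
    return smoothed


def scale_to_marker(curve, marker_peak):
    """Divide a curve by the peak intensity of a DNA standard marker.

    Scaling by a single marker peak is an assumed rule for illustration only.
    """
    if marker_peak <= 0:
        raise ValueError("marker_peak must be positive")
    return [v / marker_peak for v in curve]
```

A curve would be passed through `savgol5` first and then through `scale_to_marker`, so that intensities from different gels become comparable before variable selection. A quadratic filter of this kind leaves any locally quadratic signal unchanged while damping isolated spikes.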