Iv. Tetko et al., NEURAL-NETWORK STUDIES .2. VARIABLE SELECTION, Journal of chemical information and computer sciences, 36(4), 1996, pp. 794-803
Number of citations
45
Subject Categories
Information Science & Library Science; Computer Application, Chemistry & Engineering; Computer Science Interdisciplinary Applications; Chemistry; Computer Science Information Systems
Quantitative structure-activity relationship (QSAR) studies usually require an estimation of the relevance of a very large set of initial variables. Determining the most important variables in theory allows better generalization by all pattern recognition methods. This study introduces and investigates five pruning algorithms designed to estimate the importance of input variables in applications of feed-forward artificial neural networks (ANNs) trained by the back-propagation algorithm, and to prune nonrelevant ones in a statistically reliable way. The analyzed algorithms produced similar variable estimations for simulated data sets, but differences were detected for real QSAR examples. Improvement of ANN prediction ability was shown after the pruning of redundant input variables. The statistical coefficients computed by ANNs for QSAR examples were better than those of multiple linear regression. Restrictions of the proposed algorithms and the potential use of ANNs are discussed.
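
The abstract does not specify the five pruning algorithms, so the following is only a generic illustration of the kind of estimate it describes, not a reconstruction of any method from the paper. It assumes a network with one hidden layer and a single output, and scores each input by summing the products of the absolute input-to-hidden and hidden-to-output weights, a common weight-magnitude heuristic. The function names and the weight values are hypothetical.

```python
def input_sensitivities(w_hidden, w_out):
    """Score each input variable by weight magnitude (assumed heuristic).

    w_hidden[j][i] is the weight from input i to hidden neuron j;
    w_out[j] is the weight from hidden neuron j to the single output.
    Returns scores normalized to sum to 1.
    """
    n_inputs = len(w_hidden[0])
    raw = [0.0] * n_inputs
    for j, row in enumerate(w_hidden):
        for i, w in enumerate(row):
            raw[i] += abs(w) * abs(w_out[j])
    total = sum(raw)
    return [s / total for s in raw] if total > 0 else raw


def prune_least_relevant(w_hidden, w_out):
    """Drop the input with the lowest score; return its index and the
    input-to-hidden weights with that input's column removed."""
    scores = input_sensitivities(w_hidden, w_out)
    drop = min(range(len(scores)), key=scores.__getitem__)
    pruned = [[w for i, w in enumerate(row) if i != drop] for row in w_hidden]
    return drop, pruned


# Hypothetical weights for a trained network with 3 inputs and 2 hidden neurons.
W_HIDDEN = [[0.9, 0.05, -1.2],
            [-0.7, 0.02, 0.4]]
W_OUT = [1.5, -0.8]

if __name__ == "__main__":
    print(input_sensitivities(W_HIDDEN, W_OUT))
    print(prune_least_relevant(W_HIDDEN, W_OUT))
```

With these made-up weights the second input receives by far the smallest score and is the one removed. A single trained network gives a noisy estimate, so a sketch like this would presumably be repeated over several independently trained networks and the network retrained after each removal before any variable is discarded for good.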