Neural network modeling for estimation of partition coefficient based on atom-type electrotopological state indices

Citation
Jj. Huuskonen et al., Neural network modeling for estimation of partition coefficient based on atom-type electrotopological state indices, J CHEM INF, 40(4), 2000, pp. 947-955
Number of citations
37
Subject Categories
Chemistry
Journal title
JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES
ISSN journal
0095-2338
Volume
40
Issue
4
Year of publication
2000
Pages
947 - 955
Database
ISI
SICI code
0095-2338(200007/08)40:4<947:NNMFEO>2.0.ZU;2-5
Abstract
A method for predicting log P values for a diverse set of 1870 organic molecules has been developed based on atom-type electrotopological-state (E-state) indices and neural network modeling. An extended set of E-state indices, which included specific indices with a more detailed description of amino, carbonyl, and hydroxy groups, was used in the current study. For the training set of 1754 molecules, the squared correlation coefficient and root-mean-squared error were r^2 = 0.90 and RMS_LOO = 0.46, respectively. Structural parameters, which included molecular weight and 38 atom-type E-state indices, were used as the inputs in 39-5-1 artificial neural networks. The results from multilinear regression analysis were r^2 = 0.87 and RMS_LOO = 0.55, respectively. For a test set of 35 nucleosides, 12 nucleoside bases, 19 drug compounds, and 50 general organic compounds (n = 116) not included in the training set, a predictive r^2 = 0.94 and RMS = 0.41 were calculated by artificial neural networks. The results for the same set by multilinear regression were r^2 = 0.86 and RMS = 0.72. The improved prediction ability of artificial neural networks can be attributed to the nonlinear properties of this method, which allowed the detection of high-order relationships between E-state indices and the n-octanol/water partition coefficient. The present approach was found to be an accurate and fast method that can be used for the reliable estimation of log P values for even the most complex structures.
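The 39-5-1 architecture and the two statistics quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives only the layer sizes (39 inputs, 5 hidden neurons, 1 output), so the sigmoid hidden activation, the linear output, the random placeholder weights, and all function names here are assumptions made for illustration.

```python
import math
import random

# Layer sizes taken from the abstract: molecular weight + 38 E-state indices,
# 5 hidden neurons, 1 output (predicted log P).
N_INPUTS, N_HIDDEN, N_OUTPUTS = 39, 5, 1


def init_network(seed=0):
    """Return random placeholder weights (assumption: a real model is trained)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
                for _ in range(N_HIDDEN)]
    b_hidden = [0.0] * N_HIDDEN
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def count_parameters():
    """Number of adjustable weights and biases in a 39-5-1 network."""
    return (N_INPUTS + 1) * N_HIDDEN + (N_HIDDEN + 1) * N_OUTPUTS


def predict(network, x):
    """Forward pass: sigmoid hidden layer (assumed), linear output (assumed)."""
    w_hidden, b_hidden, w_out, b_out = network
    if len(x) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} descriptors, got {len(x)}")
    hidden = [
        1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def squared_correlation(observed, predicted):
    """Squared Pearson correlation coefficient between two value lists."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def rms_error(observed, predicted):
    """Root-mean-squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
```

Calling `predict(init_network(), descriptors)` with a list of 39 descriptor values returns a single number. With untrained placeholder weights that number carries no chemical meaning; the sketch only shows the shape of the computation and how r^2 and RMS values like those reported could be derived from paired observed and predicted log P values.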