Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA)

Citation
D.B. Turner and P. Willett, Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA), J. Comput.-Aided Mol. Des., 14(1), 2000, pp. 1-21
Number of citations
26
Subject categories
Chemistry & Analysis
Journal title
JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN
Journal ISSN
0920-654X
Volume
14
Issue
1
Year of publication
2000
Pages
1 - 21
Database
ISI
SICI code
0920-654X(200001)14:1<1:EOTEDF>2.0.ZU;2-Z
Abstract
The EVA structural descriptor, based upon calculated fundamental molecular vibrational frequencies, has proved to be an effective descriptor for both QSAR and database similarity calculations. The descriptor is sensitive to 3D structure but has an advantage over field-based 3D-QSAR methods inasmuch as structural superposition is not required. The original technique involves a standardisation method wherein uniform Gaussians of fixed standard deviation (sigma) are used to smear out frequencies projected onto a linear scale. The smearing function permits the overlap of proximal frequencies and thence the extraction of a fixed-dimensional descriptor regardless of the number and precise values of the frequencies. It is proposed here that there exist optimal localised values of sigma in different spectral regions; that is, the overlap of frequencies using uniform Gaussians may, at certain points in the spectrum, either be insufficient to pick up relationships where they exist or mix up information to such an extent that significant correlations are obscured by noise. A genetic algorithm is used to search for optimal localised sigma values using crossvalidated PLS regression scores as the fitness score to be optimised. The resultant models were then validated against a previously unseen test set of compounds and through data scrambling. The performance of EVA_GA is compared to that of EVA and analogous CoMFA studies; in the latter case a brief evaluation is made of the effect of grid resolution upon the stability of CoMFA PLS scores, particularly in relation to test set predictions.
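
Illustrative sketch (not part of the original record): the Gaussian smearing described in the abstract can be pictured roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the 0-4000 cm^-1 range, the 5 cm^-1 sampling step, the region boundaries and all sigma values are hypothetical choices made for illustration. In EVA_GA the per-region sigma values would be proposed by the genetic algorithm and scored by crossvalidated PLS; that search is not shown here.

```python
import numpy as np

def eva_descriptor(freqs, sigmas, lo=0.0, hi=4000.0, step=5.0):
    """Sum one normalised Gaussian per frequency, sampled on a fixed grid.

    `sigmas` may be a single value (uniform smearing) or one value per
    frequency (localised smearing). Range and step are assumed values.
    """
    freqs = np.asarray(freqs, dtype=float)
    sigmas = np.broadcast_to(np.asarray(sigmas, dtype=float), freqs.shape)
    n_points = int(round((hi - lo) / step)) + 1
    grid = np.linspace(lo, hi, n_points)
    # Rows are grid points, columns are individual frequencies.
    diff = grid[:, None] - freqs[None, :]
    kernels = np.exp(-0.5 * (diff / sigmas) ** 2) / (sigmas * np.sqrt(2.0 * np.pi))
    return kernels.sum(axis=1)

def localised_sigmas(freqs, edges, region_sigmas):
    """Give each frequency the sigma of the spectral region it falls in.

    `edges` holds the region boundaries in ascending order, so region i
    spans edges[i] to edges[i + 1]. Both arguments are hypothetical.
    """
    freqs = np.asarray(freqs, dtype=float)
    idx = np.searchsorted(edges, freqs, side="right") - 1
    idx = np.clip(idx, 0, len(region_sigmas) - 1)
    return np.asarray(region_sigmas, dtype=float)[idx]

# Two made-up molecules with different numbers of vibrational frequencies.
freqs_a = [650.0, 1200.0, 1210.0, 1700.0, 3050.0]
freqs_b = [1205.0, 1690.0, 2950.0]

# Uniform smearing: both molecules yield vectors of the same length.
uniform_a = eva_descriptor(freqs_a, 10.0)
uniform_b = eva_descriptor(freqs_b, 10.0)

# Localised smearing: one sigma per spectral region (assumed values).
edges = [0.0, 1000.0, 2000.0, 4000.0]
region_sigmas = [5.0, 20.0, 10.0]
per_freq_sigma = localised_sigmas(freqs_a, edges, region_sigmas)
local_a = eva_descriptor(freqs_a, per_freq_sigma)
```

The point of the sketch is that molecules with different numbers of frequencies map to vectors of identical length, and that switching from a single sigma to region-specific sigmas changes how strongly neighbouring frequencies (here 1200 and 1210) overlap.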