D.B. Turner and P. Willett, Evaluation of the EVA descriptor for QSAR studies: 3. The use of a genetic algorithm to search for models with enhanced predictive properties (EVA_GA), J COMPUT A, 14(1), 2000, pp. 1-21
The EVA structural descriptor, based upon calculated fundamental molecular
vibrational frequencies, has proved to be an effective descriptor for both
QSAR and database similarity calculations. The descriptor is sensitive to 3D
structure but has an advantage over field-based 3D-QSAR methods inasmuch as
structural superposition is not required. The original technique involves a
standardisation method wherein uniform Gaussians of fixed standard deviation
(sigma) are used to smear out frequencies projected onto a linear scale. The
smearing function permits the overlap of proximal frequencies and thence the
extraction of a fixed dimensional descriptor regardless of the number and
precise values of the frequencies. It is proposed here that there exist
optimal localised values of sigma in different spectral regions; that is, the
overlap of frequencies using uniform Gaussians may, at certain points in the
spectrum, either be insufficient to pick up relationships where they exist or
mix up information to such an extent that significant correlations are
obscured by noise. A genetic algorithm is used to search for optimal
localised sigma values using crossvalidated PLS regression scores as the
fitness score to be optimised. The resultant models were then validated
against a previously unseen test set of compounds and through data
scrambling. The performance of EVA_GA is compared to that of EVA and
analogous CoMFA studies; in the latter case a brief evaluation is made of the
effect of grid resolution upon the stability of CoMFA PLS scores particularly
in relation to test set predictions.
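
The Gaussian smearing step described in the abstract can be illustrated with
a minimal sketch. It is not the authors' implementation. The function name
eva_descriptor, the 0-4000 cm^-1 range, the 5 cm^-1 sampling step, and the
way a localised sigma is assigned (each frequency takes the sigma of the
spectral region it falls in) are all assumptions made for illustration; the
abstract does not specify them.

```python
import numpy as np

def eva_descriptor(freqs, region_edges, region_sigmas, f_max=4000.0, step=5.0):
    """Smear vibrational frequencies onto a fixed grid with Gaussian kernels.

    freqs         -- calculated frequencies (cm^-1), any number of them
    region_edges  -- ascending interior boundaries between spectral regions
    region_sigmas -- one sigma per region (len(region_edges) + 1 values)
    All parameter names and defaults are hypothetical.
    """
    freqs = np.asarray(freqs, dtype=float)
    sigmas_by_region = np.asarray(region_sigmas, dtype=float)
    # Fixed sampling grid: its length does not depend on the molecule.
    grid = np.arange(0.0, f_max + step, step)
    # Assumption: each frequency uses the sigma of the region containing it.
    region_index = np.searchsorted(region_edges, freqs, side="right")
    sigmas = sigmas_by_region[region_index]
    # One normalised Gaussian per frequency, evaluated at every grid point.
    diff = grid[:, None] - freqs[None, :]
    kernels = np.exp(-0.5 * (diff / sigmas) ** 2) / (sigmas * np.sqrt(2.0 * np.pi))
    # Overlapping kernels add up, giving the fixed-dimensional descriptor.
    return kernels.sum(axis=1)

# Two hypothetical molecules with different numbers of frequencies.
edges = [1000.0, 2000.0]
sigmas = [20.0, 10.0, 5.0]
d_small = eva_descriptor([500.0, 1200.0, 1650.0, 3000.0], edges, sigmas)
d_large = eva_descriptor([300.0, 800.0, 1500.0, 2900.0, 3100.0, 3500.0], edges, sigmas)
```

Both descriptors have the same length even though the molecules have
different numbers of frequencies. Setting every entry of region_sigmas to
the same value reproduces the uniform-sigma case; in a search such as the
one described above, region_sigmas would be the quantities being varied.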