B.D. Gute et al., Molecular similarity based estimation of properties: A comparison of structure spaces and property spaces, SAR QSAR EN, 11(5-6), 2001, pp. 363-382
Molecular similarity methods have emerged as powerful tools in analog
selection, chemical classification based on toxic modes of action, and
property estimation. The basic assumption of structure-activity
relationships (SAR) is that similar structures usually have similar
properties. Therefore, similarity methods can be used for the selection of
analogs and estimation of properties of chemicals from their structural
analogs in property spaces.
Each similarity method is user defined. Its efficacy depends on the set of
descriptors used to define the intermolecular similarity of chemicals as
well as on the mathematical function used to quantify similarity. Also,
similarity methods can be based on experimental data or computed molecular
descriptors.
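
To make the role of the similarity function concrete, the following is a minimal sketch of two common choices: a Euclidean distance over numeric descriptor vectors and a shared-count similarity over discrete features such as atom pairs. The abstract does not say which functions the authors used, so both function names and the feature labels in the usage note are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def euclidean_distance(x, y):
    """Distance between two numeric descriptor vectors of equal length.

    Smaller values mean the two chemicals are closer in the chosen space.
    """
    if len(x) != len(y):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def count_overlap_similarity(features_a, features_b):
    """Similarity in [0, 1] between two multisets of discrete features.

    Computed as 2 * (shared feature count) / (total feature count).
    This is an assumed, illustrative formula, not one taken from the source.
    """
    counts_a, counts_b = Counter(features_a), Counter(features_b)
    # Counter intersection keeps the minimum count of each shared feature.
    shared = sum((counts_a & counts_b).values())
    total = sum(counts_a.values()) + sum(counts_b.values())
    # Two empty feature lists are treated as identical by convention.
    return 2.0 * shared / total if total else 1.0
```

For example, `count_overlap_similarity(["C-2-C", "C-3-O"], ["C-2-C", "C-2-C"])` shares one feature out of four in total and so evaluates to 0.5. Swapping either the descriptor set or the function changes which chemicals count as neighbors, which is why the choice matters.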
We have carried out a comparative study of similarity spaces derived from
experimental data vis-a-vis computed structural parameters for two sets of
chemicals: (a) a diverse set of 76 chemicals derived from the TSCA
Inventory and (b) the 166 structurally distinct constituents of JP-8
identified by GC/MS. Property spaces for these two sets of chemicals were
created using experimental and calculated physicochemical properties. Atom
pairs (APs) and topological indices calculated by POLLY v2.3 were used to
create theoretical structure spaces. These spaces were used for the
k-nearest-neighbor (KNN) estimation of properties with K = 1-10, 15, 20,
and 25. The results will be presented with a comparative analysis of the
effectiveness of property spaces and structure spaces in analog selection
and property estimation.
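
The KNN estimation step can be sketched as follows. This is a minimal illustration under stated assumptions: the estimate is the unweighted mean of the K nearest neighbors' property values, distance is Euclidean, and each chemical is held out in turn when it is estimated. The abstract specifies only the K values, so the averaging rule, the distance, the hold-out scheme, and all names here are hypothetical.

```python
import math

# K values listed in the abstract: 1 through 10, then 15, 20, 25.
K_VALUES = list(range(1, 11)) + [15, 20, 25]


def knn_estimate(query, training, k):
    """Estimate a property as the mean over the k nearest training chemicals.

    `training` is a list of (descriptor_vector, property_value) pairs.
    Unweighted averaging is an assumption, not the authors' stated method.
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training items")
    # Rank training chemicals by Euclidean distance to the query vector.
    ranked = sorted(training, key=lambda item: math.dist(query, item[0]))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / k


def leave_one_out_estimates(data, k):
    """Estimate each chemical's property from all the other chemicals."""
    return [
        knn_estimate(vector, data[:i] + data[i + 1:], k)
        for i, (vector, _) in enumerate(data)
    ]
```

Running `leave_one_out_estimates` once per value in `K_VALUES`, in a property space and again in a structure space, and comparing the estimates with the measured values is one way to set up the kind of comparison the abstract describes.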