L. Pogliani, MODELING PURINES AND PYRIMIDINES WITH THE LINEAR COMBINATION OF CONNECTIVITY INDEXES - MOLECULAR CONNECTIVITY LCCI-MC METHOD, Journal of chemical information and computer sciences, 36(6), 1996, pp. 1082-1091
Number of citations
36
Subject Categories
Information Science & Library Science; Computer Application, Chemistry & Engineering; Computer Science Interdisciplinary Applications; Chemistry; Computer Science Information Systems
A series of five experimental properties of DNA-RNA bases (U, T, A, G,
and C): singlet excitation energies $\Delta E_1$ and $\Delta E_2$, oscillator
strengths $f_1$ and $f_2$, and molar absorption coefficient $\varepsilon_{260}$,
plus three experimental properties of a wider set of purine and
pyrimidine bases: average $\langle pK \rangle = \tfrac{1}{2}(pK_a + pK_b)$, molecular
weights MW, and solubility, have been simulated in two different ways with
linear combinations of connectivity indices (LCCI) chosen from a medium
sized molecular connectivity set $\{\chi\} = \{D, D^v, {}^0\chi, {}^0\chi^v, {}^1\chi, {}^1\chi^v, \chi_t, \chi_t^v\}$.
A forward selection technique and a
full combinatorial space technique have been used to choose the best
linear combination of connectivity indices for an optimal modeling. The
given properties are very well modeled, with the only exception being
$\langle pK \rangle$, whose modeling could be improved with the introduction of
fragment reciprocal connectivity indices that take into account the number
of basic and acidic groups of the given molecules. The (easier to perform)
forward selection technique is on many occasions a good alternative
to the more cumbersome full space selection technique and can normally
be used to restrict the dimension of the full combinatorial space,
thus facilitating the computation. Limits in the forward selection method
can frequently be overcome with the introduction of orthogonal indices.
While the simulation of the molecular weights casts some light on
the modeling of hydrogen-rich or -poor molecules, the simulation of
the solubility shows (i) how far a satisfactory modeling of a small number
of compounds can be extrapolated with the aid of the same indices to
a wider set, (ii) the importance of linear combinations of squared
connectivity indices used in the absolute value mode, and (iii) the
contribution of supramolecular connectivity indices for an improved modeling
of the solubility. The positive role of the $\chi_t$ and ${}^1\chi^v$
indices throughout the modeling of the different properties seems to be
due to the rather low collinearity of these indices relative to the
other indices of the connectivity set, a fact that underlines their use
in molecular modeling with linear combinations of connectivity indices.
In an Appendix, the notation of delta cardinal number is extended
to the triplet code words to generate the different families and subfamilies
of the genetic code.
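
The two selection strategies contrasted in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the selection criterion (the coefficient of determination of an ordinary least-squares fit with an intercept), the function names, and the random data standing in for the eight connectivity indices are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def fit_r2(X, y, cols):
    """R^2 of a least-squares fit of y on the chosen columns plus an intercept."""
    A = np.column_stack([X[:, list(cols)], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - (resid @ resid) / ss_tot)


def forward_selection(X, y, k):
    """Greedy search: add, one at a time, the index that most improves R^2."""
    chosen = []
    for _ in range(k):
        remaining = [j for j in range(X.shape[1]) if j not in chosen]
        best = max(remaining, key=lambda j: fit_r2(X, y, chosen + [j]))
        chosen.append(best)
    return tuple(sorted(chosen))


def full_search(X, y, k):
    """Exhaustive search over every k-index combination."""
    return max(combinations(range(X.shape[1]), k),
               key=lambda c: fit_r2(X, y, c))


# Hypothetical data: 12 compounds, 8 index columns, one property.
# These are random numbers, not values from the paper.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 8))
y = rng.normal(size=12)

fwd = forward_selection(X, y, 3)
full = full_search(X, y, 3)
print("forward:", fwd, round(fit_r2(X, y, fwd), 3))
print("full:   ", full, round(fit_r2(X, y, full), 3))
```

Under this criterion the exhaustive search can never score below the greedy one, since the greedy result is itself one of the combinations it examines. The greedy search needs far fewer fits, which is why it can serve to narrow the combinatorial space first.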