We present a detailed study of the variations in boiling point among
35 nonane isomers as an illustration of structure-property studies.
By restricting attention to molecules of the same size, we effectively
eliminate the dominant role of molecular size in the structure-property
relationship, which obscures the minor but important variations in
properties with molecular branching. In this way we can better see the
role that molecular shape, that is, the variation in molecular
branching, plays in governing the relative magnitudes of molecular
properties. We use multiple regression analysis as the method for the
analysis of experimental data. We consider three
alternative ways of using regression analysis. As molecular
descriptors we have adopted the connectivity indices 1chi-6chi and
0chi. However, the approach outlined applies equally to other molecular
descriptors, e.g., topological indices and quantum chemical parameters,
as well as to molecular properties used as variables representing a
molecule. We start by considering the descriptors 0chi-6chi as a basis,
i.e., we use the set of descriptors in an order prescribed in advance.
Because of the interdependence of the descriptors (often strong for
some of them), the basis so selected is nonorthogonal. Using the
connectivity indices 0chi-6chi we obtain a regression with R (the
coefficient of correlation) = 0.969 and S (the standard error) = 1.72.
A consequence of nonorthogonality is that the inclusion or exclusion of
a single descriptor may dramatically alter the relative role that the
other descriptors play. Hence, as our second alternative, we outline a
procedure in which orthogonalized descriptors are used in the
structure-property regression. As a result, the coefficients of the
regression equations are stable, i.e., they do not change if the
regression equation is truncated or a descriptor is added to the
regression. The
standard error S and the coefficient of correlation R of the regression
are not affected by the orthogonalization. Finally, as the last
alternative,
we illustrate the use of a "greedy" algorithm approach, in which one
selects descriptors in a stepwise fashion. At each step one
interactively selects the descriptor that produces the smallest
standard error in the structure-property regression. In this way we
arrived at a regression with a correlation coefficient of R = 0.968 and
a standard error of S = 1.69.
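
The second and third alternatives can be illustrated with a minimal
sketch in Python. It rests on several assumptions that are not taken
from the study: the data are synthetic random numbers rather than the
connectivity indices and boiling points of the nonanes, the
orthogonalization is taken to be a sequential one in which each
descriptor is replaced by its residual after regression on the
preceding descriptors, and S is taken to be the residual standard
deviation corrected for degrees of freedom. The names fit_stats,
orthogonalize and greedy_select are hypothetical helpers introduced
only for this illustration.

```python
import numpy as np

def fit_stats(X, y):
    """Least-squares fit with intercept; returns (coefficients, R, S)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sse = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    R = np.sqrt(max(0.0, 1.0 - sse / sst))
    # Assumed definition of S: residual standard deviation with
    # degrees-of-freedom correction.
    S = np.sqrt(sse / (len(y) - A.shape[1]))
    return coef, R, S

def orthogonalize(X):
    """Sequential orthogonalization in the given column order: each column
    is replaced by its residual after regression on a constant and on the
    already orthogonalized preceding columns."""
    n, k = X.shape
    Q = np.empty((n, k), dtype=float)
    for j in range(k):
        A = np.column_stack([np.ones(n), Q[:, :j]])
        c, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        Q[:, j] = X[:, j] - A @ c
    return Q

def greedy_select(X, y, n_steps):
    """Stepwise selection: at each step add the descriptor that gives the
    smallest standard error S for the enlarged regression."""
    chosen, s_path = [], []
    remaining = list(range(X.shape[1]))
    for _ in range(n_steps):
        best = min(remaining,
                   key=lambda j: fit_stats(X[:, chosen + [j]], y)[2])
        chosen.append(best)
        remaining.remove(best)
        s_path.append(fit_stats(X[:, chosen], y)[2])
    return chosen, s_path

# Synthetic stand-in data (NOT the nonane data): 35 "molecules" and
# 7 mutually correlated "descriptors".
rng = np.random.default_rng(0)
n, k = 35, 7
X = rng.normal(size=(n, 1)) + 0.5 * rng.normal(size=(n, k))
y = X @ rng.normal(size=k) + rng.normal(size=n)

Q = orthogonalize(X)
_, R_x, S_x = fit_stats(X, y)            # nonorthogonal basis
coef_full, R_q, S_q = fit_stats(Q, y)    # orthogonalized basis
coef_trunc, _, _ = fit_stats(Q[:, :3], y)  # truncated regression
order, S_path = greedy_select(X, y, n_steps=4)

print("R, S (original basis):      ", R_x, S_x)
print("R, S (orthogonalized basis):", R_q, S_q)
print("coefficients, 3 descriptors:", coef_trunc)
print("same coefficients, 7 descr.:", coef_full[:4])
print("greedy order and S per step:", order, S_path)
```

In this sketch the orthogonalized descriptors span the same space as
the original ones, so R and S of the full regression agree for the two
bases, while the leading coefficients are the same whether three or
seven orthogonalized descriptors are used. Under the assumed definition
of S, a regression on fewer descriptors can have a smaller S than the
full regression, because S is corrected for the number of fitted
parameters.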