In terms of the Born-Oppenheimer approach to chemistry, molecules may
be represented as three-dimensional objects, and their structures may
be compared: this comparison is of crucial importance to successful
molecular modelling. Furthermore, molecular modelling techniques
presuppose that a suitable molecular representation is available. This
paper outlines various methods for representing molecular structure,
such as positional coordinates (of which crystallographic coordinates
are an example) and internal parameters such as bond distances and
angles.
The similarity between two molecules may be established by maximally
superimposing them, and then determining the extent to which they
cannot be brought to coincide. This is the approach used in many
molecular modelling programs. Alternatively, instead of viewing the
molecules as objects embedded in three-dimensional space, they can
also be thought of as representative points within a hyperdimensional
space spanned by a set of 3N-6 independent geometric coordinates that
define the structure of the molecule (N is the number of atoms). In
this case, the similarity between two molecules can be expressed in
terms of the distance between their representative points. Preferred
conformations may be identified by examining the distributions of
representative points for closely related molecules within these
hyperdimensional spaces. However, in order to compare ensembles of
molecules, multidimensional statistical techniques have had to be
employed. The methods of principal component and cluster analysis are
described and are illustrated by means of simple two-dimensional
examples. Finally, two examples are taken from the chemical literature
to demonstrate how the multivariate statistical methods can be applied
in practice, and what type of results they yield.
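
As a rough illustration of two of the ideas summarised above, the
following Python sketch computes (a) the residual misfit between two
structures after optimal rigid superposition and (b) the principal
components of a set of representative points. It is a minimal sketch
under stated assumptions, not the procedure used in the paper: the
superposition uses the Kabsch (SVD-based) algorithm, which is one common
choice, atoms are assumed to be listed in the same order in both
structures, and the function names superposition_rmsd and
principal_components are hypothetical, as are the example coordinates.

```python
import numpy as np


def superposition_rmsd(P, Q):
    """Root-mean-square deviation between two (N, 3) coordinate sets
    after optimal rigid superposition (Kabsch algorithm, assumed here).
    Atom i in P must correspond to atom i in Q."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by moving both centroids to the origin.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure what cannot be brought to coincide.
    diff = (R @ Pc.T).T - Qc
    return np.sqrt((diff ** 2).sum() / len(P))


def principal_components(X):
    """Principal component analysis of an (M, K) array whose rows are
    representative points (M structures, K geometric coordinates).
    Returns the variances, the component axes (rows) and the scores."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)               # centre each coordinate
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = S ** 2 / (len(X) - 1)     # sample variance per component
    scores = Xc @ Vt.T                    # points expressed in the new axes
    return variances, Vt, scores


# Hypothetical bent triatomic (arbitrary units), then a rotated and
# translated copy of the same structure.
P = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
t = 0.7
Rz = np.array([[np.cos(t), -np.sin(t), 0.0],
               [np.sin(t), np.cos(t), 0.0],
               [0.0, 0.0, 1.0]])
Q = P @ Rz.T + np.array([1.0, -2.0, 0.5])
print(superposition_rmsd(P, Q))  # close to zero: same shape

# Two-dimensional example: points lying on a straight line have a
# single non-trivial principal component.
X = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
variances, axes, scores = principal_components(X)
print(variances)
```

In the hyperdimensional picture, each row of X would instead hold the
internal coordinates (for example, selected bond distances and angles)
of one molecule, and the Euclidean distance between two rows would serve
as the dissimilarity between the corresponding structures.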