A multivariate quantitative physicochemical characterization of the
five bases adenine (A), cytosine (C), guanine (G), thymine (T) and
uracil (U), followed by principal component analysis, shows that the
relative dissimilarities between the bases of DNA (A, C, G and T) are
almost
the same (i.e. balanced). In contrast, mRNA (containing U instead of
T) has a considerably larger relative physicochemical similarity
between C and U than between any other pair of bases and is therefore
inherently more unbalanced. These results provide a physicochemical
explanation for the presence of thymine instead of uracil as an
element of DNA. The principal component scores enable a quantitative
description of
nucleic acid sequence data to be made for structure-activity
modelling purposes.
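
The workflow described above (descriptor table, principal component analysis, pairwise comparison of bases, sequence encoding) might be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the descriptor matrix is filled with random placeholder numbers rather than measured physicochemical properties, and the function names (pca_scores, pairwise_distances, encode_sequence), the autoscaling step and the choice of two components are all assumptions made for the example.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Autoscale each descriptor column, then return PC scores via SVD."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    # Scores are the projections of the scaled rows onto the loadings.
    return U[:, :n_components] * s[:n_components]


def pairwise_distances(scores, labels):
    """Euclidean distance between every pair of bases in score space."""
    out = {}
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            out[labels[i] + labels[j]] = float(
                np.linalg.norm(scores[i] - scores[j])
            )
    return out


def encode_sequence(seq, score_table):
    """Replace each base by its PC scores and concatenate into one vector."""
    return np.concatenate([score_table[base] for base in seq])


bases = ["A", "C", "G", "T", "U"]

# PLACEHOLDER DATA: random numbers standing in for a table of measured
# physicochemical descriptors (rows = bases, columns = descriptors).
rng = np.random.default_rng(0)
X = rng.normal(size=(len(bases), 6))

scores = pca_scores(X, n_components=2)
score_table = dict(zip(bases, scores))

# Compare the DNA alphabet (A, C, G, T) with the mRNA alphabet (A, C, G, U).
dna_idx = [0, 1, 2, 3]
rna_idx = [0, 1, 2, 4]
dna_dist = pairwise_distances(scores[dna_idx], [bases[i] for i in dna_idx])
rna_dist = pairwise_distances(scores[rna_idx], [bases[i] for i in rna_idx])

# A four-base sequence becomes a vector of 4 x 2 = 8 descriptor values.
encoded = encode_sequence("ACGU", score_table)
```

With real descriptor values in place of the placeholders, the spread of the values in dna_dist and rna_dist would give one rough way to compare how balanced the two alphabets are, and vectors such as encoded could serve as inputs to a structure-activity model.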