We have used an improved block-entropy measure to gain further
insights into the short-range correlations present in whole
chromosomes of S. cerevisiae, viruses and organelles, and in very
large genomic regions of E. coli. Although DNA sequences are largely
inhomogeneous and word frequencies are unevenly distributed, the
comparison of entire chromosomes and large genomic regions shows a
"bulk" composition
homogeneity. This property suggests that biases in selection,
directional mutational pressure and recombination processes act to
homogenize the base composition of the DNA molecules within a genome,
but their
mode of action, relative impact and direction may vary in different
organisms. The most interesting results appear to be the differences
between the SW (C,G/A,T) and RY (A,G/C,T) two-letter alphabet
entropies.
Deviations from randomness in E. coli and S. cerevisiae sequences
particularly concern SW dinucleotide frequencies and RY
tetranucleotide frequencies. (C) 1996 Academic Press Limited
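
As an illustration of the quantities named above, the following is a minimal sketch of how a block entropy over the SW and RY two-letter alphabets could be computed. It uses the plain plug-in Shannon estimator on overlapping n-letter words, which is an assumption made for illustration and is not the improved measure used in this work. The function names `recode` and `block_entropy` and the demo sequence are hypothetical.

```python
import math
from collections import Counter

# Two-letter alphabets as defined in the text:
# SW groups C,G (S) against A,T (W); RY groups A,G (R) against C,T (Y).
SW = {"C": "S", "G": "S", "A": "W", "T": "W"}
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode(seq, mapping):
    """Translate a DNA string into a two-letter alphabet, skipping other symbols."""
    return "".join(mapping[base] for base in seq.upper() if base in mapping)


def block_entropy(seq, n):
    """Plug-in Shannon entropy, in bits, of the overlapping n-letter words of seq.

    Assumption: this naive estimator is biased for short sequences and is
    only a stand-in for the improved measure referred to in the text.
    """
    total = len(seq) - n + 1
    if n < 1 or total < 1:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(seq[i:i + n] for i in range(total))
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    demo = "ATGCGCGATATTAGCCGGTA"  # hypothetical sequence, not real data
    for name, mapping in (("SW", SW), ("RY", RY)):
        coded = recode(demo, mapping)
        # n = 2 corresponds to dinucleotides, n = 4 to tetranucleotides
        print(name, block_entropy(coded, 2), block_entropy(coded, 4))
```

For a random two-letter sequence with equal letter frequencies, the entropy of n-letter words approaches n bits, so values below that bound indicate deviations from randomness of the kind reported for SW dinucleotides and RY tetranucleotides.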