H. Herzel and I. Grosse, Correlations in DNA sequences: The role of protein-coding segments, Physical Review E: Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics, 55(1), 1997, pp. 800-810
Protein-coding segments (exons) exhibit persistent correlations between their nucleotides with a pronounced period of three. This paper shows that this periodicity, induced by nonuniform codon usage, implies long-range correlations over hundreds of base pairs if the length distribution of exons is taken into account. We derive expressions that relate the length distribution of exons to the correlation decay and find agreement with numerical simulations. Finally, we analyze the decay of the mutual information function in yeast chromosomes, in an E. coli chromosome region, and in myosin heavy chain genes as representative examples. In these cases, most of the long-range statistical dependences can be explained, even quantitatively.
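
The period-three signal in the mutual information function can be illustrated with a minimal sketch. It is not the authors' code or data. It assumes the standard plug-in estimate I(k) = sum over a,b of p_ab(k) log2[p_ab(k) / (p_a p_b)], and a hypothetical toy "coding" sequence in which only the first codon position has a biased nucleotide composition (a crude stand-in for nonuniform codon usage). The names mutual_information and toy_coding_sequence, and all probabilities, are illustrative assumptions.

```python
import math
import random
from collections import Counter


def mutual_information(seq, k):
    """Plug-in estimate of I(k), in bits, between symbols k positions apart."""
    n = len(seq) - k
    if k < 1 or n < 1:
        raise ValueError("k must satisfy 1 <= k < len(seq)")
    first, second = seq[:n], seq[k:]
    pair_counts = Counter(zip(first, second))
    first_counts = Counter(first)
    second_counts = Counter(second)
    total = 0.0
    for (a, b), c in pair_counts.items():
        # p_ab / (p_a * p_b) simplifies to c * n / (count_a * count_b)
        total += (c / n) * math.log2(c * n / (first_counts[a] * second_counts[b]))
    return total


def toy_coding_sequence(n_codons, rng):
    """Hypothetical exon model: codon position 1 is biased, positions 2 and 3 are uniform."""
    biased = [0.7, 0.1, 0.1, 0.1]      # assumed weights for A, C, G, T
    uniform = [0.25, 0.25, 0.25, 0.25]
    weights_by_position = [biased, uniform, uniform]
    return "".join(
        rng.choices("ACGT", weights=weights_by_position[i % 3])[0]
        for i in range(3 * n_codons)
    )


rng = random.Random(0)                  # fixed seed for reproducibility
coding = toy_coding_sequence(10000, rng)
background = "".join(rng.choices("ACGT", k=30000))  # i.i.d. uniform control

for k in range(1, 10):
    print(k, round(mutual_information(coding, k), 4),
          round(mutual_information(background, k), 4))
```

In this toy model the estimate for the "coding" sequence is larger at k = 3, 6, 9 than at the other distances and does not decay, while the uniform control stays close to zero. The decay analyzed in the abstract arises only once exons of finite, distributed length are embedded in other sequence, which this sketch does not model.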