We calculated correlations of the nucleotide distributions along the
E. coli genome. Subsequent cluster analysis of the correlation distributions
showed that the genome was composed of two qualitatively different types of
nucleotide sequences. The first type exhibited strong correlations of the
genomic distributions of A with T and G with C, and high anticorrelations of
A with C and G with T. In contrast, the second type was characterized by weak
or negligible correlations typical of randomized sequences. Both types of
sequences were almost equally abundant in the E. coli genome, and their
length varied from several hundred nucleotides to about 70 kilobases. They
were not disjunct with respect to their (G + C) content, but the high
correlations and anticorrelations were rather characteristic of (A + T)-rich
genomic segments. We offer possible explanations of the mosaic structure of
the E. coli genome. (C) 2000 Academic Press.
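
To illustrate what a correlation between nucleotide distributions can mean, here is a minimal sketch in Python. It rests on assumptions that the abstract does not state: base counts are taken in consecutive non-overlapping windows, and the correlation measure is the Pearson coefficient. The names window_counts, pearson, and base_correlations, as well as the window size, are hypothetical and chosen only for illustration; this is not necessarily the procedure used in the study.

```python
from itertools import combinations
from math import sqrt

BASES = "ACGT"


def window_counts(seq, window):
    """Count each base in consecutive non-overlapping windows (assumed scheme)."""
    seq = seq.upper()
    counts = {base: [] for base in BASES}
    # A trailing fragment shorter than `window` is ignored.
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        for base in BASES:
            counts[base].append(chunk.count(base))
    return counts


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    if n < 2:
        return 0.0
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def base_correlations(seq, window):
    """Correlate the windowed counts of every pair of bases in one segment."""
    counts = window_counts(seq, window)
    return {
        (a, b): pearson(counts[a], counts[b])
        for a, b in combinations(BASES, 2)
    }


if __name__ == "__main__":
    # Toy segment alternating between (A + T)-only and (G + C)-only windows.
    toy = ("AATT" + "GGCC") * 10
    for pair, r in base_correlations(toy, window=4).items():
        print(pair, round(r, 3))
```

In this toy segment the counts of A and T rise and fall together, as do those of G and C, while A varies against C and G against T, which is the sign pattern the abstract attributes to the first type of sequence. A shuffled sequence would be expected to give values near zero for all pairs, corresponding to the second type.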