C. Shioiri and N. Takahata, Skew of mononucleotide frequencies, relative abundance of dinucleotides, and DNA strand asymmetry, J MOL EVOL, 53(4-5), 2001, pp. 364-376
Based on 152 mitochondrial genomes and 36 bacterial chromosomes that have
been completely sequenced, as well as three long contigs for human
chromosomes 6, 21, and 22, we examined skews of mononucleotide frequencies
and the relative abundance of dinucleotides in one DNA strand. Each group
of these genomes has its own characteristics. Regarding mitochondrial
genomes, both C(p)G and G(p)T are underrepresented, while either G(p)G or
C(p)C or both are overrepresented. The relative frequency of nucleotide T
vs A and of nucleotide G vs C is strongly skewed, due presumably to strand
asymmetry in replication errors and unidirectional DNA replication from
single origins. Exceptions are found in the plant and yeast mitochondrial
genomes, each of which may replicate from multiple origins. Regarding
bacterial genomes, the "universal" rule of C(p)G deficiency is restricted
to archaebacteria and some eubacteria. In other eubacteria, the most
underrepresented dinucleotide is either T(p)A or G(p)T. In general, there
are significant T vs A and G vs C skews in each half of the bacterial
genome, although these are almost exactly canceled out over the whole
genome. Regarding human chromosomes 6, 21, and 22, dinucleotide C(p)G
tends to be avoided. The relative frequency of mononucleotides exhibits
conspicuous local skews, suggesting that each of these chromosomal
segments contains more than one DNA replication origin. It is concluded
that, when there are several replicons in a genomic region, not only the
number of DNA replication origins but also their directionality is
important, and that the observed patterns of nucleotide frequencies in the
genome strongly support the hypothesis of strand asymmetry in replication
errors.
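
The two quantities examined above can be illustrated with a minimal
sketch. The abstract does not give the formulas, so the definitions used
here are assumptions based on common conventions and not necessarily the
authors' own: the skew of X vs Y is taken as (n_X - n_Y) / (n_X + n_Y),
and the relative abundance of dinucleotide XY as f_XY / (f_X * f_Y), both
computed on a single strand. The function names skew, relative_abundance,
and windowed_skew are hypothetical.

```python
from collections import Counter

BASES = set("ACGT")


def skew(seq, x, y):
    """Assumed definition: (n_x - n_y) / (n_x + n_y) on one strand.

    Returns 0.0 when neither base occurs in the sequence.
    """
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    total = nx + ny
    return (nx - ny) / total if total else 0.0


def relative_abundance(seq):
    """Assumed definition: f_XY / (f_X * f_Y) for each observed dinucleotide.

    Values below 1 indicate underrepresentation, above 1 overrepresentation.
    Positions containing symbols other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    di = Counter(p for p in pairs if set(p) <= BASES)
    n_mono, n_di = sum(mono.values()), sum(di.values())
    rho = {}
    for pair, count in di.items():
        fx = mono[pair[0]] / n_mono
        fy = mono[pair[1]] / n_mono
        rho[pair] = (count / n_di) / (fx * fy)
    return rho


def windowed_skew(seq, x, y, window):
    """Skew in consecutive non-overlapping windows of the given length.

    A trailing fragment shorter than the window is ignored.
    """
    return [
        skew(seq[i:i + window], x, y)
        for i in range(0, len(seq) - window + 1, window)
    ]
```

On a toy sequence, windowed_skew("GGGGCCCC", "G", "C", 4) returns
[1.0, -1.0] while skew("GGGGCCCC", "G", "C") is 0.0: opposite skews in the
two halves cancel over the whole sequence, which mirrors the pattern
described for bacterial genomes. Sign changes between windows are the kind
of local skew that the abstract relates to replication origins.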