Skew of mononucleotide frequencies, relative abundance of dinucleotides, and DNA strand asymmetry

Citation
C. Shioiri and N. Takahata, Skew of mononucleotide frequencies, relative abundance of dinucleotides, and DNA strand asymmetry, J MOL EVOL, 53(4-5), 2001, pp. 364-376
Number of citations
32
Subject categories
Biology, Experimental Biology
Journal title
JOURNAL OF MOLECULAR EVOLUTION
Journal ISSN
0022-2844
Volume
53
Issue
4-5
Year of publication
2001
Pages
364 - 376
Database
ISI
SICI code
0022-2844(200110/11)53:4-5<364:SOMFRA>2.0.ZU;2-R
Abstract
Based on 152 mitochondrial genomes and 36 bacterial chromosomes that have been completely sequenced, as well as three long contigs for human chromosomes 6, 21, and 22, we examined skews of mononucleotide frequencies and the relative abundance of dinucleotides in one DNA strand. Each group of these genomes has its own characteristics. Regarding mitochondrial genomes, both C(p)G and G(p)T are underrepresented, while either G(p)G or C(p)C or both are overrepresented. The relative frequency of nucleotide T vs A and of nucleotide G vs C is strongly skewed, due presumably to strand asymmetry in replication errors and unidirectional DNA replication from single origins. Exceptions are found in the plant and yeast mitochondrial genomes, each of which may replicate from multiple origins. Regarding bacterial genomes, the "universal" rule of C(p)G deficiency is restricted to archaebacteria and some eubacteria. In other eubacteria, the most underrepresented dinucleotide is either T(p)A or G(p)T. In general, there are significant T vs A and G vs C skews in each half of the bacterial genome, although these are almost exactly canceled out over the whole genome. Regarding human chromosomes 6, 21, and 22, dinucleotide C(p)G tends to be avoided. The relative frequency of mononucleotides exhibits conspicuous local skews, suggesting that each of these chromosomal segments contains more than one DNA replication origin. It is concluded that, when there are several replicons in a genomic region, not only the number of DNA replication origins but also the directionality is important and that the observed patterns of nucleotide frequencies in the genome strongly support the hypothesis of strand asymmetry in replication errors.
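
The abstract does not give the formulas behind the two statistics it reports, so the following is only a minimal sketch under commonly used definitions, which may differ from the authors' exact measures. It assumes the skew of X vs Y is (X - Y) / (X + Y) on a single strand, and that the relative abundance of a dinucleotide X(p)Y is the ratio of its observed frequency to the product of the two mononucleotide frequencies (values below 1 suggest underrepresentation). The function names `skews` and `dinucleotide_odds` are hypothetical and chosen for illustration.

```python
from collections import Counter


def skews(seq):
    """Return (T-vs-A skew, G-vs-C skew) for one DNA strand.

    Assumed definition: skew(X, Y) = (X - Y) / (X + Y), using base counts.
    """
    counts = Counter(seq.upper())

    def skew(x, y):
        total = counts[x] + counts[y]
        # Return 0.0 when neither base occurs, to avoid division by zero.
        return (counts[x] - counts[y]) / total if total else 0.0

    return skew("T", "A"), skew("G", "C")


def dinucleotide_odds(seq):
    """Return observed/expected ratios for all 16 dinucleotides.

    Assumed definition: rho(XY) = f(XY) / (f(X) * f(Y)), where f(XY) is
    counted over overlapping positions on the given strand only.
    The sequence is assumed to contain only A, C, G, and T.
    """
    seq = seq.upper()
    n = len(seq)
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(n - 1))
    odds = {}
    for x in "ACGT":
        for y in "ACGT":
            fx, fy = mono[x] / n, mono[y] / n
            fxy = di[x + y] / (n - 1) if n > 1 else 0.0
            # Undefined when either base is absent from the sequence.
            odds[x + y] = fxy / (fx * fy) if fx and fy else float("nan")
    return odds
```

Applied to each half of a bacterial chromosome separately, `skews` would be expected to return values of opposite sign that roughly cancel over the whole genome, which is the pattern the abstract describes. Sliding the same function along a long contig in fixed-size windows is one way to look for the local skews reported for the human chromosomes.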