S. Karlin et al., Why is CpG suppressed in the genomes of virtually all small eukaryotic viruses but not in those of large eukaryotic viruses, Journal of Virology, 68(5), 1994, pp. 2889-2897
Dinucleotide over- and underrepresentation is evaluated in all
available completely sequenced DNA or RNA viral genomes, ranging in
size from 3 to 250 kb (available RNA viruses fall into the small-virus
category). The dinucleotide CpG is statistically underrepresented
(suppressed) in all but four of the small viruses (more than 75 with
lengths of <30 kb) but has normal relative abundances in most large
viruses (≥30 kb). Most retrotransposons in eukaryotic species also
show low CpG relative abundances. Interpretations, especially in some
cases of DNA viruses or viruses with a DNA intermediate, might relate
to methylation effects and modes of viral integration and excision.
Other possible contributing factors relate to dinucleotide stacking
energies, special mutation mechanisms, and evolutionary events.
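
The abstract does not say how "relative abundance" is computed. A common choice, assumed here rather than taken from the record, is the odds ratio of the observed dinucleotide frequency to the product of its two mononucleotide frequencies, with counts pooled over the sequence and its reverse complement so that the value does not depend on which strand was deposited. A value near 1 means the dinucleotide occurs about as often as base composition alone predicts; a value well below 1 indicates suppression. The sketch below illustrates that calculation in Python. The function names and the `symmetrize` option are hypothetical, and no cutoff for "suppressed" is built in.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = set("ACGT")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def dinucleotide_odds_ratio(seq: str, dinuc: str = "CG",
                            symmetrize: bool = True) -> float:
    """Observed/expected ratio f_XY / (f_X * f_Y) for one dinucleotide.

    Assumption: this odds-ratio definition is illustrative; the abstract
    itself does not specify the statistic. RNA input is handled by
    mapping U to T. Ambiguous bases (e.g. N) are skipped.
    """
    seq = seq.upper().replace("U", "T")
    # Pool counts from both strands instead of concatenating them, so no
    # artificial dinucleotide is created at the junction.
    strands = [seq, reverse_complement(seq)] if symmetrize else [seq]

    mono = Counter()
    di = Counter()
    for strand in strands:
        mono.update(base for base in strand if base in _BASES)
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if set(pair) <= _BASES:
                di[pair] += 1

    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_di == 0:
        raise ValueError("sequence has no unambiguous dinucleotides")

    f_x = mono[dinuc[0]] / n_mono
    f_y = mono[dinuc[1]] / n_mono
    if f_x == 0 or f_y == 0:
        raise ValueError("a component base of the dinucleotide is absent")

    return (di[dinuc] / n_di) / (f_x * f_y)


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real viral genome.
    toy = "ATGCATTTAGCCATGGTACAATGCAAGT"
    print(f"CpG odds ratio: {dinucleotide_odds_ratio(toy):.3f}")
```

Applied to a complete genome read from a sequence file, the same function would give one CpG value per virus, which could then be compared across the small (<30 kb) and large (≥30 kb) size classes described above.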