Our basic observation is that each genome has a characteristic "signature"
defined as the ratios between the observed dinucleotide frequencies and the
frequencies expected if neighbors were chosen at random (dinucleotide
relative abundances). The remarkable fact is that the signature is
relatively constant throughout the genome; i.e., the patterns and levels
of dinucleotide relative abundances of every 50-kb segment of the genome
are about the same. Comparison of the signatures of different genomes
provides a measure of
similarity which has the advantage that it looks at all the DNA of an
organism and does not depend on the ability to align homologous sequences
of specific genes. Genome signature comparisons show that plasmids, both
specialized and broad-range, and their hosts have substantially compatible
(similar) genome signatures. Mammalian mitochondrial (Mt) genomes are very
similar, and animal and fungal Mt are generally moderately similar, but
they diverge significantly from plant and protist Mt sets. Moreover, Mt
genome signature differences between species parallel the corresponding
nuclear genome signature differences, despite large differences between Mt
and host nuclear
signatures. In signature terms, we find that the archaea are not a
coherent clade. For example, Sulfolobus and Halobacterium are extremely
divergent.
There is no consistent pattern of signature differences among
thermophiles. More generally, grouping prokaryotes by environmental
criteria (e.g., habitat propensities, osmolarity tolerance, chemical
conditions) reveals no correlations in genome signature.
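
The signature described above can be sketched in a few lines of Python. This
is a minimal illustration, not the authors' implementation: it computes each
dinucleotide relative abundance as f(XY) / (f(X) f(Y)), the observed
dinucleotide frequency divided by the frequency expected from random
neighbors. The function names are hypothetical, the sketch does not combine
a sequence with its reverse complement, and the comparison measure (mean
absolute difference over the 16 dinucleotides) is an assumption, since the
text above does not state how signatures are compared.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]


def relative_abundances(seq):
    """Return {XY: f(XY) / (f(X) * f(Y))} for all 16 dinucleotides.

    Hypothetical helper: positions containing anything other than
    A, C, G or T are skipped when counting.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence has no countable dinucleotides")

    rho = {}
    for xy in DINUCS:
        # Frequency expected if neighbors were chosen at random.
        expected = (mono[xy[0]] / n_mono) * (mono[xy[1]] / n_mono)
        observed = di[xy] / n_di
        # Undefined (NaN) when one of the two bases never occurs.
        rho[xy] = observed / expected if expected > 0 else float("nan")
    return rho


def signature_difference(rho_a, rho_b):
    """Mean absolute difference between two signatures.

    Assumed comparison measure, chosen for illustration only.
    """
    return sum(abs(rho_a[xy] - rho_b[xy]) for xy in DINUCS) / len(DINUCS)
```

Applying `relative_abundances` to successive 50-kb windows of one genome and
comparing the results with `signature_difference` would be one way to check
the within-genome constancy claimed above; comparing whole-genome signatures
gives the between-genome measure.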