Species phylogenies derived from comparisons of single genes are rarely
consistent with each other, due to horizontal gene transfer(1), unrecognized
paralogy and highly variable rates of evolution(2). The advent of completely
sequenced genomes allows the construction of a phylogeny that is less
sensitive to such inconsistencies and more representative of whole genomes than
are single-gene trees. Here, we present a distance-based phylogeny(3)
constructed on the basis of gene content, rather than on sequence identity, of
13 completely sequenced genomes of unicellular species. The similarity
between two species is defined as the number of genes that they have in common
divided by their total number of genes. In this type of phylogenetic
analysis, evolutionary distance can be interpreted in terms of evolutionary
events such as the acquisition and loss of genes, whereas the underlying
properties (the gene content) can be interpreted in terms of function. As such, it
takes a position intermediate to phylogenies based on single genes and
phylogenies based on phenotypic characteristics. Although our comprehensive
genome phylogeny is independent of phylogenies based on the level of sequence
identity of individual genes, it correlates with the standard reference of
prokaryotic phylogeny based on sequence similarity of 16S rRNA (ref. 4).
Thus, shared gene content between genomes is quantitatively determined by
phylogeny, rather than by phenotype, and horizontal gene transfer has only a
limited role in determining the gene content of genomes.
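
The similarity measure defined above (genes in common divided by the total number of genes) can be illustrated with a minimal Python sketch. Several points here are assumptions rather than details taken from the text: each genome is represented as a set of hypothetical shared-gene identifiers (how genes are judged to be "in common" is not specified here), the "total number of genes" is read by default as the size of the union of the two gene sets, the `normalizer="smaller"` option is only one alternative reading, and distance is taken as one minus similarity. The function names and the toy genomes are invented for illustration.

```python
from itertools import combinations


def gene_content_similarity(genes_a, genes_b, normalizer="union"):
    """Fraction of genes shared by two genomes.

    ASSUMPTION: "total number of genes" is read as the union of both
    gene sets ("union"); "smaller" divides by the smaller genome instead.
    """
    a, b = set(genes_a), set(genes_b)
    shared = len(a & b)
    if normalizer == "union":
        total = len(a | b)
    elif normalizer == "smaller":
        total = min(len(a), len(b))
    else:
        raise ValueError(f"unknown normalizer: {normalizer!r}")
    return shared / total if total else 0.0


def distance_matrix(genomes, normalizer="union"):
    """Pairwise distances (ASSUMPTION: distance = 1 - similarity)."""
    names = sorted(genomes)
    dist = {(name, name): 0.0 for name in names}
    for x, y in combinations(names, 2):
        d = 1.0 - gene_content_similarity(genomes[x], genomes[y], normalizer)
        dist[(x, y)] = dist[(y, x)] = d
    return dist


# Hypothetical toy genomes; identifiers stand for shared gene families.
genomes = {
    "species_A": {"g1", "g2", "g3", "g4"},
    "species_B": {"g2", "g3", "g4", "g5"},
    "species_C": {"g4", "g6"},
}

if __name__ == "__main__":
    for pair, d in sorted(distance_matrix(genomes).items()):
        print(pair, round(d, 3))
```

A matrix of this kind could then be passed to any distance-based tree-building method. The choice of denominator matters when genome sizes differ greatly, since a small genome can share nearly all of its genes with a large one and still score low under the union reading.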