We introduce a simple and rapid strategy to identify genes that are
responsible for species-specific phenotypes. The genome of a species
that has a specific phenotype is compared with that of at least one
closely related species that lacks this phenotype. Homologous genes
that are shared among the species compared are identified and discarded
from the list of candidates for species-specific genes. The process is
automated and rapidly yields a small subset of the genome that likely
contains genes responsible for the species-specific features. Functions
are assigned to the genes, and dubious annotations are filtered out.
Information is extracted not only from the presence of genes, but also
from their absence with respect to known phenotypes. We have applied
the technique to identify a set of species-specific genes in
Helicobacter pylori by comparing it with its closest relatives for
which complete genome sequences are available, Haemophilus influenzae
and Escherichia coli. Of the genes of this set for which functional
features can be obtained, a large fraction (63%, 123 proteins) is
potentially involved in H. pylori's interaction with its host. We
hypothesize that a family of outer membrane proteins is critical for
the ability of H. pylori to colonize host cells in highly acidic
environments. (C) 1998 Federation of European Biochemical Societies.
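
The subtraction step described above can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation. It assumes that homology hits have already been computed by some sequence-comparison tool and are supplied as (query gene, reference species) pairs. The function name `species_specific_genes`, the gene identifiers, and the species labels are all hypothetical placeholders.

```python
from typing import Iterable, Set, Tuple


def species_specific_genes(
    query_genes: Iterable[str],
    homology_hits: Iterable[Tuple[str, str]],
    reference_species: Iterable[str],
) -> Set[str]:
    """Return the query genes that have no homolog in any reference species.

    homology_hits holds (query_gene, species) pairs, one per detected homolog.
    How homology is decided (tool, score cutoff) is outside this sketch.
    """
    references = set(reference_species)
    # Genes shared with at least one reference species are discarded.
    shared = {gene for gene, species in homology_hits if species in references}
    return set(query_genes) - shared


# Toy input with placeholder identifiers (not real data).
genes = ["geneA", "geneB", "geneC", "geneD"]
hits = [
    ("geneA", "ref1"),
    ("geneB", "ref1"),
    ("geneB", "ref2"),
    ("geneD", "other"),  # hit in a species that is not part of the comparison
]
candidates = species_specific_genes(genes, hits, ["ref1", "ref2"])
print(sorted(candidates))  # ['geneC', 'geneD']
```

In this sketch, adding more reference species can only shrink the candidate set, which matches the idea that each additional closely related genome removes more shared genes from the list.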