D.P. Kreil and C.A. Ouzounis, Identification of thermophilic species by the amino acid compositions deduced from their genomes, Nucleic Acids Research, 29(7), 2001, pp. 1608-1615
The global amino acid compositions as deduced from the complete genomic
sequences of six thermophilic archaea, two thermophilic bacteria, 17
mesophilic bacteria and two eukaryotic species were analysed by hierarchical
clustering and principal components analysis. Both methods showed an
influence of several factors on amino acid composition. Although GC content
has a dominant effect, thermophilic species can be identified by their
global amino acid compositions alone. This study presents a careful
statistical analysis of factors that affect amino acid composition and also
yields specific features of the average amino acid composition of
thermophilic species. Moreover, we introduce the first example of a
'compositional tree' of species that takes into account not only homologous
proteins, but also proteins unique to particular species. We expect this
simple yet novel approach to be a useful additional tool for the study of
phylogeny at the genome level.
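
The idea of a 'compositional tree' can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance measure or linkage rule, so Euclidean distance and average linkage are assumptions here, and the species names and sequences are invented toy data. Each genome is reduced to a vector of 20 amino acid fractions pooled over all of its proteins, and the vectors are then merged by naive agglomerative clustering.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def global_composition(proteins):
    """Return the 20 amino acid fractions pooled over all proteins of a genome."""
    counts = Counter()
    for seq in proteins:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return [counts[a] / total for a in AMINO_ACIDS]


def euclidean(u, v):
    """Euclidean distance between two composition vectors (an assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(compositions):
    """Naive average-linkage clustering; returns the tree as nested tuples."""
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in sorted(compositions)]

    def dist(pair):
        (_, a), (_, b) = pair
        total = sum(euclidean(compositions[x], compositions[y])
                    for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=dist)
        clusters = [c for c in clusters if c is not left and c is not right]
        clusters.append(((left[0], right[0]), left[1] + right[1]))
    return clusters[0][0]


# Hypothetical toy proteomes; real input would be all predicted proteins
# of each completely sequenced genome.
genomes = {
    "species_A": ["MKKEEIVK", "EEKKVIE"],
    "species_B": ["MKEEKIVE"],
    "species_C": ["MAAGLQSA", "QQAASGL"],
}
compositions = {name: global_composition(p) for name, p in genomes.items()}
tree = average_linkage(compositions)
print(tree)
```

Because the composition is pooled over every protein, proteins without homologues in other species contribute to the tree just as homologous ones do, which is the property the abstract highlights.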