Y. Van De Peer et al., An updated and comprehensive rRNA phylogeny of (crown) eukaryotes based on rate-calibrated evolutionary distances, J MOL EVOL, 51(6), 2000, pp. 565-576
Recent experience with molecular phylogeny has shown that all molecular
markers have strengths and weaknesses. Nonetheless, despite several notable
discrepancies with phylogenies obtained from protein data, the merits of
the small subunit ribosomal RNA (SSU rRNA) as a molecular phylogenetic
marker remain indisputable. Over the last 10 to 15 years a massive SSU rRNA
database has been gathered, including more than 3000 complete sequences
from eukaryotes. This creates a huge computational challenge, which is
exacerbated by
phenomena such as extensive rate variation among sites in the molecule. A
few years ago, a fast phylogenetic method was developed that takes into
account among-site rate variation in the estimation of evolutionary
distances.
This "substitution rate calibration" (SRC) method not only corrects for a
major source of artifacts in phylogeny reconstruction but, because it is
based on a distance approach, allows comprehensive trees including thousands
of sequences to be constructed in a reasonable amount of time. In this
study, a nucleotide variability map and a phylogenetic tree were
constructed, using the SRC method, based on all available (January 2000)
complete SSU rRNA sequences (2551) for species belonging to the so-called
eukaryotic crown.
The resulting phylogeny constitutes the most complete description of
overall eukaryote diversity and relationships to date. Furthermore, branch
lengths estimated with the SRC method better reflect the huge differences
in evolutionary rates among and within eukaryotic lineages. The ribosomal
RNA tree is compared with a recent protein phylogeny obtained from
concatenated actin, alpha-tubulin, beta-tubulin, and elongation factor
1-alpha amino acid sequences. A consensus phylogeny of the eukaryotic crown
based on currently available molecular data is discussed, as well as
specific problems encountered in analyzing sequences when large differences
in substitution rate
are present, either between different sequences (rate variation among
lineages) or between different positions within the same sequence
(among-site rate variation).
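
To illustrate why among-site rate variation matters when estimating evolutionary distances, the following is a minimal sketch. It does not implement the SRC method described in the abstract. As a stand-in it uses the textbook Jukes-Cantor distance and its gamma-corrected variant; the function names and the gamma shape parameter `alpha` are assumptions introduced here for illustration only.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences.

    Positions with a gap ('-') in either sequence are skipped.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc_distance(p):
    """Jukes-Cantor distance, assuming all sites evolve at the same rate."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates among sites.

    alpha is the (hypothetical, user-supplied) gamma shape parameter;
    smaller values mean stronger rate variation among sites.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)
```

For the same observed fraction of differing sites, the gamma-corrected estimate is larger than the uniform-rate estimate, and the gap widens as `alpha` decreases. This is the general sense in which ignoring among-site rate variation underestimates large distances; the correction used in the paper itself may differ in form.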