J.A. Lake, RECONSTRUCTING EVOLUTIONARY TREES FROM DNA AND PROTEIN SEQUENCES - PARALINEAR DISTANCES, Proceedings of the National Academy of Sciences of the United States of America, 91(4), 1994, pp. 1455-1459
The reconstruction of phylogenetic trees from DNA and protein sequences
is confounded by unequal rate effects. These effects can group rapidly
evolving taxa with other rapidly evolving taxa, whether or not they are
genealogically related. All algorithms are sensitive to these effects
whenever the assumptions on which they are based are not met. The
algorithm presented here, called paralinear distances, is valid for a
much broader class of substitution processes than previous algorithms
and is accordingly less affected by unequal rate effects. It may be used
with all nucleic acid, protein, or other sequences, provided that their
evolution may be modeled as a succession of Markov processes. The
properties of the method have been proven both analytically and by
computer simulations. Like all other methods, paralinear distances can
fail when sequences are misaligned or when site-to-site sequence
variation of rates is extensive. To examine the usefulness of paralinear
distances, the "origin of the eukaryotes" has been investigated by the
analysis of elongation factor Tu sequences with a variety of sequence
alignments. It has been found that the order in which sequences are
pairwise aligned strongly determines the topology which is reconstructed
by paralinear distances (as it does for all other reconstruction methods
tested). When the parts of the alignment that are unaffected by
alignment order are analyzed, paralinear distances strongly select the
eocyte topology. This provides evidence that the eocyte prokaryotes are
the closest prokaryotic relatives of the eukaryotes.
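
The abstract names the paralinear distance but does not state its formula. As an illustration only, the sketch below uses the form in which the distance is commonly cited, d = -ln( det J / sqrt(det D_x * det D_y) ), where J is the matrix of paired-state counts between two aligned sequences and D_x, D_y are diagonal matrices of each sequence's state counts. That formula, the four-letter DNA alphabet, the skipping of gaps and ambiguous characters, and every function name here are assumptions for this sketch, not details taken from the record above.

```python
import math

ALPHABET = "ACGT"  # assumed DNA alphabet; other state sets work the same way


def divergence_matrix(seq_x, seq_y, alphabet=ALPHABET):
    """Count how often state a in seq_x is paired with state b in seq_y."""
    index = {state: k for k, state in enumerate(alphabet)}
    n = len(alphabet)
    counts = [[0.0] * n for _ in range(n)]
    for a, b in zip(seq_x, seq_y):
        # Assumption: columns with gaps or ambiguous characters are skipped.
        if a in index and b in index:
            counts[index[a]][index[b]] += 1.0
    return counts


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap flips the sign
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det


def paralinear_distance(seq_x, seq_y, alphabet=ALPHABET):
    """Sketch of d = -ln(det J / sqrt(det D_x * det D_y)) for aligned sequences."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    j = divergence_matrix(seq_x, seq_y, alphabet)
    row_totals = [sum(row) for row in j]        # diagonal of D_x
    col_totals = [sum(col) for col in zip(*j)]  # diagonal of D_y
    det_j = determinant(j)
    denom = math.sqrt(math.prod(row_totals) * math.prod(col_totals))
    if det_j <= 0.0 or denom == 0.0:
        raise ValueError("distance undefined for these sequences")
    return -math.log(det_j / denom)
```

Under these assumptions, two identical sequences that contain every state give a distance of zero, and the value is unchanged when the two sequences are swapped. A full analysis would compute this for every pair of taxa and pass the resulting matrix to a distance-based tree-building method.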