D. Bryant et P. Waddell, RAPID EVALUATION OF LEAST-SQUARES AND MINIMUM-EVOLUTION CRITERIA ON PHYLOGENETIC TREES, Molecular biology and evolution, 15(10), 1998, pp. 1346-1359
We present fast new algorithms for evaluating trees with respect to le
ast squares and minimum evolution (ME), the most commonly used criteri
a for inferring phylogenetic trees from distance data. The new algorit
hms include an optimal O(N-2) time algorithm for calculating the edge
(branch or internode) lengths on a tree according to ordinary or unwei
ghted least squares (OLS); an O(N-3) time algorithm for edge lengths u
nder weighted least squares (WLS) including the Fitch-Margoliash metho
d; and an optimal O(N-4) time algorithm for generalized least-squares
(GLS) edge lengths (where N is the number of taxa in the tree). The ME
criterion is based on the sum of edge lengths. Consequently, the edge
lengths algorithms presented here lead directly to O(N-2), O(N-3), an
d O(N-4) time algorithms for ME under OLS, WLS, and GLS, respectively.
All of these algorithms are as fast as or faster than any of those pr
eviously published, and the algorithms for OLS and GLS are the fastest
possible (with respect to order of computational complexity). A major
advantage of our new methods is that they are as well adapted to mult
ifurcating trees as they are to binary trees. An optimal algorithm for
determining path lengths from a tree with given edge lengths is also
developed. This leads to an optimal O(N-2) algorithm for OLS sums of s
quares evaluation and corresponding O(N-3) and O(N-4) time algorithms
for WLS and GLS sums of squares, respectively. The GLS algorithm is ti
me-optimal if the covariance matrix is already inverted. The speed of
each algorithm is assessed analytically-the speed increases we calcula
te are confirmed by the dramatic speed increases resulting from their
implementation in PAUP 4.0. The new algorithms enable far more extens
ive tree searches and statistical evaluations (e.g., bootstrap, parame
tric bootstrap, or jackknife) in the same amount of time. Hopefully, t
he fast algorithms for WLS and GLS will encourage the use of these cri
teria for evaluating trees and their edge lengths (e.g., for approxima
te divergence time estimates), since they should be more statistically
efficient than OLS.