A. Bendor et al., CONSTRUCTING PHYLOGENIES FROM QUARTETS - ELUCIDATION OF EUTHERIAN SUPERORDINAL RELATIONSHIPS, Journal of computational biology, 5(3), 1998, pp. 377-390
Citations number
43
Categorie Soggetti
Mathematics,Biology,"Biochemical Research Methods",Mathematics,"Biothechnology & Applied Migrobiology
In this work we present two new approaches for constructing phylogenet
ic trees. The input is a list of weighted quartets over it taxa, Each
quartet is a subtree on four taxa, and its weight represents a confide
nce level for the specific topology. The goal is to construct a binary
tree with It leaves such that the total weight of the satisfied quart
ets is maximized (an NP hard problem). The first approach we present i
s based on geometric ideas. Using semidefinite programming, we embed t
he n points on the n-dimensional unit sphere, while maximizing an obje
ctive function. This function depends on Euclidean distances between t
he four points and reflects the quartet topology. Given the embedding,
we construct a binary tree by performing geometric clustering. This p
rocess is similar to the traditional neighbor joining, with the differ
ence that the update phase retains geometric meaning: When two neighbo
rs are joined together, their common ancestor is taken to be the cente
r of mass of the original points. The geometric algorithm runs in poly
(n) time, but there are no guarantees on the quality of its output. In
contrast, our second algorithm is based on dynamic programming, and i
t is guaranteed to find the optimal tree (with respect to the given qu
artets), Its running time is a modest exponential, so it can be implem
ented for modest values of it, We have implemented both algorithms and
ran them on real data for n = 15 taxa (14 mammalian orders and an out
group taxon), The two resulting trees improve previously published tre
es and seem to be of biological relevance. On this dataset, the geomet
ric algorithm produced a tree whose score is 98.2% of the optimal valu
e on this input set (72.1% vs. 73.4%), This gives rise to the hope tha
t the geometric approach will prove viable even for larger cases where
the exponential, dynamic programming approach is no longer feasible.