CONSTRUCTING PHYLOGENIES FROM QUARTETS - ELUCIDATION OF EUTHERIAN SUPERORDINAL RELATIONSHIPS

Citation
A. Bendor et al., CONSTRUCTING PHYLOGENIES FROM QUARTETS - ELUCIDATION OF EUTHERIAN SUPERORDINAL RELATIONSHIPS, Journal of computational biology, 5(3), 1998, pp. 377-390
Citations number
43
Categorie Soggetti
Mathematics,Biology,"Biochemical Research Methods",Mathematics,"Biothechnology & Applied Migrobiology
ISSN journal
10665277
Volume
5
Issue
3
Year of publication
1998
Pages
377 - 390
Database
ISI
SICI code
1066-5277(1998)5:3<377:CPFQ-E>2.0.ZU;2-R
Abstract
In this work we present two new approaches for constructing phylogenet ic trees. The input is a list of weighted quartets over it taxa, Each quartet is a subtree on four taxa, and its weight represents a confide nce level for the specific topology. The goal is to construct a binary tree with It leaves such that the total weight of the satisfied quart ets is maximized (an NP hard problem). The first approach we present i s based on geometric ideas. Using semidefinite programming, we embed t he n points on the n-dimensional unit sphere, while maximizing an obje ctive function. This function depends on Euclidean distances between t he four points and reflects the quartet topology. Given the embedding, we construct a binary tree by performing geometric clustering. This p rocess is similar to the traditional neighbor joining, with the differ ence that the update phase retains geometric meaning: When two neighbo rs are joined together, their common ancestor is taken to be the cente r of mass of the original points. The geometric algorithm runs in poly (n) time, but there are no guarantees on the quality of its output. In contrast, our second algorithm is based on dynamic programming, and i t is guaranteed to find the optimal tree (with respect to the given qu artets), Its running time is a modest exponential, so it can be implem ented for modest values of it, We have implemented both algorithms and ran them on real data for n = 15 taxa (14 mammalian orders and an out group taxon), The two resulting trees improve previously published tre es and seem to be of biological relevance. On this dataset, the geomet ric algorithm produced a tree whose score is 98.2% of the optimal valu e on this input set (72.1% vs. 73.4%), This gives rise to the hope tha t the geometric approach will prove viable even for larger cases where the exponential, dynamic programming approach is no longer feasible.