Combining data sets with different phylogenetic histories

Authors
Citation
J. J. Wiens, Combining data sets with different phylogenetic histories, SYST BIOL, 47(4), 1998, pp. 568-581
Number of citations
28
Subject categories
Biology
Journal title
SYSTEMATIC BIOLOGY
Journal ISSN
1063-5157
Volume
47
Issue
4
Year of publication
1998
Pages
568 - 581
Database
ISI
SICI code
1063-5157(199812)47:4<568:CDSWDP>2.0.ZU;2-X
Abstract
The possibility that two data sets may have different underlying phylogenetic histories (such as gene trees that deviate from species trees) has become an important argument against combining data in phylogenetic analysis. However, two data sets sampled for a large number of taxa may differ in only part of their histories. This is a realistic scenario and one in which the relative advantages of combined, separate, and consensus analysis become much less clear. I propose a simple methodology for dealing with this situation that involves (1) partitioning the available data to maximize detection of different histories, (2) performing separate analyses of the data sets, and (3) combining the data but considering questionable or unresolved those parts of the combined tree that are strongly contested in the separate analyses (and which therefore may have different histories) until a majority of unlinked data sets support one resolution over another. In support of this methodology, computer simulations suggest that (1) the accuracy of combined analysis for recovering the true species phylogeny may exceed that of either of two separately analyzed data sets under some conditions, particularly when the mismatch between phylogenetic histories is small and the estimates of the underlying histories are imperfect (few characters, high homoplasy, or both) and (2) combined analysis provides a poor estimate of the species tree in areas of the phylogenies with different histories but gives an improved estimate in regions that share the same history. Thus, when there is a localized mismatch between the histories of two data sets, the separate, consensus, and combined analyses may all give unsatisfactory results in certain parts of the phylogeny. Similarly, approaches that allow data combination only after a global test of heterogeneity will suffer from the potential failings of either separate or combined analysis, depending on the outcome of the test. Excision of conflicting taxa is also problematic, in that doing so may obfuscate the position of conflicting taxa within a larger tree, even when their placement is congruent between data sets. Application of the proposed methodology to molecular and morphological data sets for Sceloporus lizards is discussed.
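
Step (3) of the methodology in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. It assumes each rooted tree is summarized as a mapping from clades (sets of taxon names) to a support value, that "strongly contested" means two incompatible clades both reach a support threshold, and that 70 is a reasonable threshold. The function names, the threshold, and the toy taxa A to E are all hypothetical.

```python
from itertools import product


def incompatible(clade_x, clade_y):
    """Two rooted clades conflict if they overlap without one nesting in the other."""
    shared = clade_x & clade_y
    return bool(shared) and shared != clade_x and shared != clade_y


def contested_clades(tree_a, tree_b, threshold=70):
    """Return pairs of incompatible clades that are both strongly supported.

    The threshold of 70 is an illustrative assumption, not a value from the paper.
    """
    strong_a = [clade for clade, support in tree_a.items() if support >= threshold]
    strong_b = [clade for clade, support in tree_b.items() if support >= threshold]
    return [(a, b) for a, b in product(strong_a, strong_b) if incompatible(a, b)]


def flag_combined(combined, tree_a, tree_b, threshold=70):
    """Label each clade of the combined tree as 'questionable' or 'accepted'."""
    contested = {
        clade
        for pair in contested_clades(tree_a, tree_b, threshold)
        for clade in pair
    }
    return {
        clade: "questionable"
        if any(incompatible(clade, other) for other in contested)
        else "accepted"
        for clade in combined
    }


# Toy example: the two separate analyses disagree only on (A,B) versus (B,C).
tree_a = {frozenset("AB"): 95, frozenset("ABC"): 80, frozenset("DE"): 88}
tree_b = {frozenset("BC"): 92, frozenset("ABC"): 75, frozenset("DE"): 90}
combined = [frozenset("AB"), frozenset("ABC"), frozenset("DE")]

flags = flag_combined(combined, tree_a, tree_b)
for clade, label in flags.items():
    print(sorted(clade), label)
```

In this toy case only the clade (A,B) is flagged, because it is the one part of the combined tree that the separate analyses strongly contest; the clades shared by both analyses stay accepted. With more than two unlinked data sets, the same check could be repeated per pair, and a clade would stay flagged until a majority of data sets supported one resolution.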