PHYLOGENETIC ANALYSIS AND INTRASPECIFIC VARIATION - PERFORMANCE OF PARSIMONY, LIKELIHOOD, AND DISTANCE METHODS

Citation
J.J. Wiens and M.R. Servedio, PHYLOGENETIC ANALYSIS AND INTRASPECIFIC VARIATION - PERFORMANCE OF PARSIMONY, LIKELIHOOD, AND DISTANCE METHODS, Systematic Biology, 47(2), 1998, pp. 228-253
Citations number
56
Subject Categories
Biology Miscellaneous
Journal title
Systematic Biology
ISSN journal
10635157
Volume
47
Issue
2
Year of publication
1998
Pages
228 - 253
Database
ISI
SICI code
1063-5157(1998)47:2<228:PAAIV->2.0.ZU;2-P
Abstract
Intraspecific variation is abundant in all types of systematic characters but is rarely addressed in simulation studies of phylogenetic method performance. We compared the accuracy of 15 phylogenetic methods using simulations to (1) determine the most accurate method(s) for analyzing polymorphic data (under simplified conditions) and (2) test if generalizations about the performance of phylogenetic methods based on previous simulations of fixed (nonpolymorphic) characters are robust to a very different evolutionary model that explicitly includes intraspecific variation. Simulated data sets consisted of allele frequencies that evolved by genetic drift. The phylogenetic methods included eight parsimony coding methods, continuous maximum likelihood, and three distance methods (UPGMA, neighbor joining, and Fitch-Margoliash) applied to two genetic distance measures (Nei's and the modified Cavalli-Sforza and Edwards chord distance). Two sets of simulations were performed. The first examined the effects of different branch lengths, sample sizes (individuals sampled per species), numbers of characters, and numbers of alleles per locus in the eight-taxon case. The second examined more extensively the effects of branch length in the four-taxon, two-allele case. Overall, the most accurate methods were likelihood, the additive distance methods (neighbor joining and Fitch-Margoliash), and the frequency parsimony method. Despite the use of a very different evolutionary model in the present article, many of the results are similar to those from simulations of fixed characters. Similarities include the presence of the "Felsenstein zone," where methods often fail, which suggests that long-branch attraction may occur among closely related species through genetic drift.
Differences between the results of fixed and polymorphic data simulations include the following: (1) UPGMA is as accurate or more accurate than nonfrequency parsimony methods across nearly all combinations of branch lengths, and (2) likelihood and the additive distance methods are not positively misled under any combination of branch lengths tested (even when the assumptions of the methods are violated and few characters are sampled). We found that sample size is an important determinant of accuracy and affects the relative success of methods (i.e., distance and likelihood methods outperform parsimony at small sample sizes). Attempts to generalize about the behavior of phylogenetic methods should consider the extreme examples offered by fixed-mutation models of DNA sequence data and genetic-drift models of allele frequencies.
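The abstract describes allele frequencies that evolve by genetic drift, are estimated from a finite sample of individuals, and are then summarized as genetic distances. The sketch below illustrates that general pipeline for two-allele loci. It is not the authors' simulation code: the Wright-Fisher drift model, the diploid assumption (2N gene copies), the parameter values, and all function names are assumptions made for illustration. Only Nei's (1972) standard genetic distance is shown; the modified chord distance is omitted.

```python
import math
import random


def drift(freq, pop_size, generations, rng):
    """Evolve one two-allele locus by Wright-Fisher drift (assumed model).

    Each generation resamples 2 * pop_size gene copies (diploid assumption)
    and returns the frequency of allele 1 after the last generation.
    """
    copies_total = 2 * pop_size
    for _ in range(generations):
        copies = sum(1 for _ in range(copies_total) if rng.random() < freq)
        freq = copies / copies_total
    return freq


def sample_freq(freq, n_individuals, rng):
    """Estimate an allele frequency from n_individuals sampled diploids."""
    copies_total = 2 * n_individuals
    copies = sum(1 for _ in range(copies_total) if rng.random() < freq)
    return copies / copies_total


def nei_distance(x, y):
    """Nei's (1972) standard genetic distance for two-allele loci.

    x and y hold the frequency of allele 1 at each locus in two species.
    Returns infinity when the two species share no alleles at any locus.
    """
    n_loci = len(x)
    jx = sum(p * p + (1 - p) ** 2 for p in x) / n_loci
    jy = sum(q * q + (1 - q) ** 2 for q in y) / n_loci
    jxy = sum(p * q + (1 - p) * (1 - q) for p, q in zip(x, y)) / n_loci
    if jxy == 0:
        return math.inf
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical usage: two species diverge from a shared ancestor.
rng = random.Random(1)
n_loci, pop_size, generations, n_sampled = 20, 50, 20, 5
ancestor = [0.5] * n_loci
species_a = [drift(p, pop_size, generations, rng) for p in ancestor]
species_b = [drift(p, pop_size, generations, rng) for p in ancestor]
observed_a = [sample_freq(p, n_sampled, rng) for p in species_a]
observed_b = [sample_freq(p, n_sampled, rng) for p in species_b]
print("distance from true frequencies:   ", nei_distance(species_a, species_b))
print("distance from sampled frequencies:", nei_distance(observed_a, observed_b))
```

Comparing the two printed values across different settings of `n_sampled` gives a rough feel for why the number of individuals sampled per species could matter for distance-based methods, although the exact numbers depend entirely on the assumed parameters.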