J.J. Wiens and M.R. Servedio, Phylogenetic analysis and intraspecific variation: performance of parsimony, likelihood, and distance methods, Systematic Biology, 47(2), 1998, pp. 228-253
Intraspecific variation is abundant in all types of systematic characters but is rarely addressed in simulation studies of phylogenetic method performance. We compared the accuracy of 15 phylogenetic methods using simulations to (1) determine the most accurate method(s) for analyzing polymorphic data (under simplified conditions) and (2) test whether generalizations about the performance of phylogenetic methods based on previous simulations of fixed (nonpolymorphic) characters are robust to a very different evolutionary model that explicitly includes intraspecific variation. Simulated data sets consisted of allele frequencies that evolved by genetic drift. The phylogenetic methods included eight parsimony coding methods, continuous maximum likelihood, and three distance methods (UPGMA, neighbor joining, and Fitch-Margoliash) applied to two genetic distance measures (Nei's and the modified Cavalli-Sforza and Edwards chord distance). Two sets of simulations were performed. The first examined the effects of different branch lengths, sample sizes (individuals sampled per species), numbers of characters, and numbers of alleles per locus in the eight-taxon case. The second examined more extensively the effects of branch length in the four-taxon, two-allele case. Overall, the most accurate methods were likelihood, the additive distance methods (neighbor joining and Fitch-Margoliash), and the frequency parsimony method. Despite the use of a very different evolutionary model in the present article, many of the results are similar to those from simulations of fixed characters. Similarities include the presence of the "Felsenstein zone," where methods often fail, which suggests that long-branch attraction may occur among closely related species through genetic drift. Differences between the results of fixed and polymorphic data simulations include the following: (1) UPGMA is as accurate as or more accurate than nonfrequency parsimony methods across nearly all combinations of branch lengths, and (2) likelihood and the additive distance methods are not positively misled under any combination of branch lengths tested (even when the assumptions of the methods are violated and few characters are sampled). We found that sample size is an important determinant of accuracy and affects the relative success of methods (i.e., distance and likelihood methods outperform parsimony at small sample sizes). Attempts to generalize about the behavior of phylogenetic methods should consider the extreme examples offered by fixed-mutation models of DNA sequence data and genetic-drift models of allele frequencies.
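
As a rough illustration of the kind of data and distance measures the abstract describes, the following Python sketch lets a two-allele locus drift and computes two genetic distances from allele frequencies. It rests on several assumptions that do not come from the article: the function names are hypothetical, drift is modeled as simple Wright-Fisher resampling of gene copies, Nei's distance is taken in its standard form, and the chord distance uses the commonly cited Cavalli-Sforza and Edwards formula averaged over loci, which may differ from the modified version the authors applied.

```python
import math
import random


def nei_distance(x, y):
    """Nei's standard genetic distance (assumed form, not taken from the article).

    x, y: lists of loci; each locus is a list of allele frequencies summing to 1.
    """
    n_loci = len(x)
    jx = sum(sum(p * p for p in locus) for locus in x) / n_loci
    jy = sum(sum(q * q for q in locus) for locus in y) / n_loci
    jxy = sum(sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(x, y)) / n_loci
    return -math.log(jxy / math.sqrt(jx * jy))


def chord_distance(x, y):
    """Cavalli-Sforza and Edwards chord distance, averaged over loci.

    Uses the unmodified textbook formula; averaging per locus is an assumption.
    """
    total = 0.0
    for lx, ly in zip(x, y):
        cos_theta = sum(math.sqrt(p * q) for p, q in zip(lx, ly))
        cos_theta = min(1.0, cos_theta)  # guard against floating-point overshoot
        total += (2.0 / math.pi) * math.sqrt(2.0 * (1.0 - cos_theta))
    return total / len(x)


def drift(p, n_copies, generations, rng):
    """Wright-Fisher drift for one allele at a two-allele locus (hypothetical helper).

    Each generation, n_copies gene copies are resampled from the current frequency p.
    """
    for _ in range(generations):
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
    return p


if __name__ == "__main__":
    rng = random.Random(1)
    # Two species drifting independently from the same ancestral frequency.
    pa = drift(0.5, 100, 50, rng)
    pb = drift(0.5, 100, 50, rng)
    print("chord distance:", chord_distance([[pa, 1 - pa]], [[pb, 1 - pb]]))
```

Longer branches correspond to more generations of drift or fewer gene copies per generation, both of which let the two frequencies wander further apart. Nei's distance is undefined when two populations share no alleles, so the demo prints only the chord distance, which stays finite in that case.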