D.E. Soltis et al., INFERRING COMPLEX PHYLOGENIES USING PARSIMONY - AN EMPIRICAL APPROACH USING 3 LARGE DNA DATA SETS FOR ANGIOSPERMS, Systematic Biology, 47(1), 1998, pp. 32-42
To explore the feasibility of parsimony analysis for large data sets,
we conducted heuristic parsimony searches and bootstrap analyses on
separate and combined DNA data sets for 190 angiosperms and three
outgroups. Separate data sets of 18S rDNA (1,855 bp), rbcL (1,428 bp),
and atpB (1,450 bp) sequences were combined into a single matrix 4,733
bp in length. Analyses of the combined data set show great improvements in
computer run times compared to those of the separate data sets and of
the data sets combined in pairs. Six searches of the 18S rDNA + rbcL
+ atpB data set were conducted; in all cases tree bisection-reconnection
(TBR) branch swapping was completed, generally within a few days. In
contrast, TBR branch swapping was not completed for any of the three
separate data sets, or for the pairwise combined data sets. These results
illustrate that it is possible to conduct a thorough search of tree space
with large data sets,
given sufficient signal. In this case and probably most others,
sufficient signal for a large number of taxa can only be obtained by
combining data sets. The combined data sets also have higher internal
support for clades than the separate data sets, and more clades receive
bootstrap support of greater than or equal to 50% in the combined analysis
than in analyses of the separate data sets. These data suggest that
one solution to the computational and analytical dilemmas posed by
large data sets is the addition of nucleotides, as well as taxa.
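
The two data-handling steps the abstract relies on, combining per-gene alignments into a single matrix and resampling characters for a bootstrap replicate, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names (concatenate, bootstrap_replicate) and the toy sequences are hypothetical, and the parsimony search itself (done in dedicated phylogenetics software) is not shown.

```python
import random

def concatenate(matrices):
    """Join per-gene alignments (dict: taxon -> aligned sequence) end to end.

    Hypothetical helper: only taxa present in every alignment are kept.
    """
    shared = set(matrices[0])
    for m in matrices[1:]:
        shared &= set(m)
    return {t: "".join(m[t] for m in matrices) for t in sorted(shared)}

def bootstrap_replicate(matrix, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_sites = len(next(iter(matrix.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {t: "".join(seq[c] for c in cols) for t, seq in matrix.items()}

# Toy alignments standing in for the 18S rDNA, rbcL, and atpB data sets.
rdna = {"A": "ACGT", "B": "ACGA", "C": "ACTA"}
rbcl = {"A": "GGC", "B": "GGT", "C": "GAT"}
atpb = {"A": "TT", "B": "TA", "C": "AA"}

combined = concatenate([rdna, rbcl, atpb])
replicate = bootstrap_replicate(combined, random.Random(0))

# Lengths reported in the abstract: the combined matrix is the sum of its parts.
reported_total = 1855 + 1428 + 1450
```

Each replicate has the same number of sites as the combined matrix; a bootstrap analysis would run a tree search on many such replicates and count how often each clade recurs.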