C.A.M. Russo et al., Efficiencies of different genes and different tree-building methods in recovering a known vertebrate phylogeny, Molecular Biology and Evolution, 13(3), 1996, pp. 525-536
The relative efficiencies of different protein-coding genes of the
mitochondrial genome and different tree-building methods in recovering a
known vertebrate phylogeny (two whale species, cow, rat, mouse,
opossum, chicken, frog, and three bony fish species) were evaluated.
The tree-building methods examined were neighbor joining (NJ), minimum
evolution (ME), maximum parsimony (MP), and maximum likelihood (ML), and
both nucleotide sequences and deduced amino acid sequences were
analyzed. Generally speaking, amino acid sequences were better than
nucleotide sequences in obtaining the true tree (topology) or trees
close to the true tree. However, when only first and second
codon-position data
were used, nucleotide sequences produced reasonably good trees. Among
the 13 genes examined, Nd5 produced the true tree in all tree-building
methods or algorithms for both amino acid and nucleotide sequence
data. Genes Cytb and Nd4 also produced the correct tree in most
tree-building algorithms when amino acid sequence data were used. By contrast,
Co2, Nd1, and Nd4l performed poorly. In general, large genes
produced better results, and when the entire set of genes was used,
all tree-building methods generated the true tree. In each
tree-building method, several distance measures or algorithms were
used, but all these distance measures or algorithms produced
essentially the same results. The ME method, in which many different
topologies are examined,
was no better than the NJ method, which generates a single final tree.
Similarly, an ML method, in which many topologies are examined, was
no better than the ML star decomposition algorithm that generates a
single final tree. In ML, the best substitution model chosen by using the
Akaike information criterion produced no better results than simpler
substitution models. These results call into question the utility of
the currently used optimization principles in phylogenetic
construction. Relatively simple methods such as the NJ and ML star
decomposition algorithms seem to produce results as good as those
obtained by more sophisticated
methods. The efficiencies of the NJ, ME, MP, and ML methods in
obtaining the correct tree were nearly the same when amino acid
sequence data were used. The most important factor in constructing
reliable phylogenetic trees seems to be the number of amino acids or
nucleotides used.
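
The abstract refers to choosing the best ML substitution model by the Akaike information criterion (AIC). As a minimal sketch of how such a selection could work, the snippet below computes AIC = 2k - 2 ln L for a few candidate models and picks the lowest value. The model names, log-likelihoods, and parameter counts are hypothetical illustrations, not values from the study, and the parameter counts cover only substitution-model parameters on the assumption that all models are fitted to the same tree, so branch-length parameters would add the same constant to every score.

```python
def aic(log_likelihood: float, n_free_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln L (lower is preferred)."""
    return 2 * n_free_params - 2 * log_likelihood


# Hypothetical (ln L, number of free substitution-model parameters) pairs.
# These numbers are made up for illustration and do not come from the paper.
candidates = {
    "JC69": (-5230.4, 0),
    "HKY85": (-5101.7, 4),
    "GTR": (-5098.9, 8),
}

# Score every candidate model and select the one with the smallest AIC.
scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)

for name in sorted(scores, key=scores.get):
    print(f"{name}: AIC = {scores[name]:.1f}")
print("Preferred by AIC:", best)
```

In this made-up case the richest model has the highest likelihood but is not preferred, because the gain in ln L does not offset the penalty for its extra parameters. This shows only how the criterion ranks models; it says nothing about whether the selected model yields a better topology, which is the question the abstract raises.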