S. Whelan and N. Goldman, Distributions of statistics used for the comparison of models of sequence evolution in phylogenetics, MOL BIOL EV, 16(9), 1999, pp. 1292-1299
Asymptotic statistical theory suggests that when two nested models are
compared by a likelihood ratio test, a χ² distribution, with number of
degrees of freedom equal to the difference in numbers of free parameters of the
two models, can be used for significance testing. This asymptotic result
has been assumed to apply in phylogenetics with the support of only a few
studies. In this paper, 12 comparisons among a selection of commonly used
models of nucleotide substitution were examined to see whether this assumption
is reasonable. The true distributions of likelihood ratio statistics were
estimated by computer simulation and compared with the appropriate χ²
distributions. It was found that χ² distributions are adequate for
significance testing in the comparison of models differing by parameters
describing transition/transversion bias and/or unequal base frequencies when these
parameters have been estimated by maximum likelihood. The χ² distribution
was, however, found to be significantly different from the true
distributions in the comparison of models differing by parameters describing
rate variation across sites (estimated by maximum likelihood) or unequal base
frequencies (estimated as the observed base frequencies in an alignment).
These last findings may have important consequences for real-model comparisons
and for the construction of increasingly complex and realistic models of
nucleotide sequence evolution.
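
The two ways of judging significance contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names (chi2_sf, lrt_statistic, chi2_p_value, simulated_p_value) are hypothetical, the log-likelihoods are assumed to come from some external maximum-likelihood program, and the simulated statistics are assumed to have been obtained by refitting both models to data sets generated under the simpler (null) model.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed-form series that exist for integer degrees of freedom,
    so only the standard library is needed.
    """
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    df = int(df)
    y = x / 2.0
    if df % 2 == 0:
        # Even df = 2k: exp(-y) * sum_{i=0}^{k-1} y^i / i!
        series = sum(y ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-y) * series
    # Odd df = 2k + 1: erfc(sqrt(y)) + exp(-y) * sum_{i=1}^{k} y^(i-1/2) / Gamma(i+1/2)
    k = (df - 1) // 2
    series = sum(y ** (i - 0.5) / math.gamma(i + 0.5) for i in range(1, k + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * series


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_p_value(lnl_null, lnl_alt, extra_params):
    """Asymptotic p-value; extra_params is the difference in free parameters."""
    return chi2_sf(lrt_statistic(lnl_null, lnl_alt), extra_params)


def simulated_p_value(observed_stat, simulated_stats):
    """Share of statistics simulated under the null model that reach the observed one."""
    exceed = sum(1 for s in simulated_stats if s >= observed_stat)
    return exceed / len(simulated_stats)
```

For two nested models differing by one free parameter, chi2_p_value(lnl_null, lnl_alt, 1) gives the asymptotic p-value, while simulated_p_value(lrt_statistic(lnl_null, lnl_alt), simulated_stats) gives the simulation-based one. A marked disagreement between the two would correspond to the cases in which the abstract reports that the χ² distribution differs from the true distribution.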