Distributions of statistics used for the comparison of models of sequence evolution in phylogenetics

Citation
S. Whelan and N. Goldman, Distributions of statistics used for the comparison of models of sequence evolution in phylogenetics, MOL BIOL EV, 16(9), 1999, pp. 1292-1299
Number of citations
21
Subject categories
Biology, Experimental Biology
Journal title
MOLECULAR BIOLOGY AND EVOLUTION
Journal ISSN
0737-4038
Volume
16
Issue
9
Year of publication
1999
Pages
1292 - 1299
Database
ISI
SICI code
0737-4038(199909)16:9<1292:DOSUFT>2.0.ZU;2-B
Abstract
Asymptotic statistical theory suggests that when two nested models are compared by a likelihood ratio test, a χ² distribution, with number of degrees of freedom equal to the difference in numbers of free parameters of the two models, can be used for significance testing. This asymptotic result has been assumed to apply in phylogenetics with the support of only a few studies. In this paper, 12 comparisons among a selection of commonly used models of nucleotide substitution were examined to see whether this assumption is reasonable. The true distributions of likelihood ratio statistics were estimated by computer simulation and compared with the appropriate χ² distributions. It was found that χ² distributions are adequate for significance testing in the comparison of models differing by parameters describing transition/transversion bias and/or unequal base frequencies when these parameters have been estimated by maximum likelihood. The χ² distribution was, however, found to be significantly different from the true distributions in the comparison of models differing by parameters describing rate variation across sites (estimated by maximum likelihood) or unequal base frequencies (estimated as the observed base frequencies in an alignment). These last findings may have important consequences for real-model comparisons and for the construction of increasingly complex and realistic models of nucleotide sequence evolution.
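
Illustration (not part of the original record): the two ways of assessing a likelihood ratio statistic that the abstract contrasts can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' procedure. The function names (lrt_statistic, chi2_sf, simulated_p_value) are hypothetical, the log-likelihood values in the usage note are made up, and the simulated statistics are assumed to come from refitting both models to data sets generated under the null model, a step that is not shown here.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null) for two nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) of a chi-square variable, integer df >= 1.

    Uses the closed-form series for the regularized upper incomplete gamma
    function at integer and half-integer arguments (standard library only).
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        term = math.exp(-y)
        total = term
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_j y^(j + 1/2) / Gamma(j + 3/2)
    total = math.erfc(math.sqrt(y))
    term = math.sqrt(y) * math.exp(-y) / math.gamma(1.5)
    for j in range((df - 1) // 2):
        total += term
        term *= y / (j + 1.5)
    return total


def simulated_p_value(observed, simulated):
    """Monte Carlo p-value: share of null-simulated statistics >= the observed one.

    The +1 in numerator and denominator counts the observed statistic itself,
    so the estimate is never exactly zero.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)
```

For example, with hypothetical log-likelihoods of -1234.5 for the simpler model and -1230.2 for the model with one extra free parameter, lrt_statistic(-1234.5, -1230.2) gives a statistic of about 8.6, and chi2_sf of that value with df = 1 gives the asymptotic p-value. Passing the same statistic to simulated_p_value together with a list of statistics simulated under the null gives the simulation-based alternative. A large gap between the two p-values would be the kind of discrepancy the abstract reports for rate-variation parameters.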