COMPLETE FAMILIES OF LINEAR INVARIANTS FOR SOME STOCHASTIC-MODELS OF SEQUENCE EVOLUTION, WITH AND WITHOUT THE MOLECULAR CLOCK ASSUMPTION

Authors
M.D. Hendy, D. Penny
Citation
M.D. Hendy and D. Penny, COMPLETE FAMILIES OF LINEAR INVARIANTS FOR SOME STOCHASTIC-MODELS OF SEQUENCE EVOLUTION, WITH AND WITHOUT THE MOLECULAR CLOCK ASSUMPTION, Journal of Computational Biology, 3(1), 1996, pp. 19-31
Number of citations
11
Subject Categories
Biology, Biochemical Research Methods, Mathematics
ISSN journal
1066-5277
Volume
3
Issue
1
Year of publication
1996
Pages
19 - 31
Database
ISI
SICI code
1066-5277(1996)3:1<19:CFOLIF>2.0.ZU;2-1
Abstract
For various models of sequence evolution, the set of linear functions of the frequencies of the nucleotide patterns forms a vector space, the invariant space. Here we distinguish between the model of nucleotide substitution, and the phylogenetic tree T describing the paths on which these changes occur. We describe a procedure to construct a basis of the invariant space for those models that are extensions of models incorporating Kimura's three substitution model of nucleotide change, including both the Jukes-Cantor and Cavender-Farris models. The dimension of the invariant space is determined, for those models where it is independent of the tree topology, as a function of the number of sequences. These are calculated where the nucleotide distribution at the root is unspecified, and both with, and without, the assumption of the molecular clock hypothesis. The invariants have a number of potential applications, including tree identification, and testing the fit of models (which could include the molecular clock) to sequence data.