CHARACTER-STATE WEIGHTING FOR CLADISTIC-ANALYSIS OF PROTEIN-CODING DNA-SEQUENCES

Citation
V.A. Albert et al., CHARACTER-STATE WEIGHTING FOR CLADISTIC-ANALYSIS OF PROTEIN-CODING DNA-SEQUENCES, Annals of the Missouri Botanical Garden, 80(3), 1993, pp. 752-766
Number of citations
105
Subject Categories
Plant Sciences
ISSN journal
0026-6493
Volume
80
Issue
3
Year of publication
1993
Pages
752 - 766
Database
ISI
SICI code
0026-6493(1993)80:3<752:CWFCOP>2.0.ZU;2-8
Abstract
Nucleotide data are a restricted character system complex enough to confound phylogenetic analyses yet simple enough to permit establishment of probability models for sequence change and corresponding character-state weighting schemes. We have previously developed a general method for weighting DNA data that is here elaborated for protein-coding sequences. Included in the present model are corrections for (i) multiple substitution events, (ii) transition/transversion bias, and (iii) differential proportions of changes occurring at first, second, and third codon positions. This model is shown to be generally consistent for all phylogenetically useful data. Greater understanding of the properties of equal versus differential character-state weighting comes from consideration of numbers of terminal taxa and lengths of tree segments. With insufficient sampling of taxa, differential weighting attempts to correct for undetected multiple substitution events. Both equal and differential weighting should give the same result if sufficient numbers of terminal taxa permit the detection of historically misleading character-state changes. Nevertheless, spurious attraction of tree segments remains a systematic problem that is not easily resolved either by equal weighting or by our differential weighting model, which acts globally rather than adjusting for different probabilities of character-state change among tree segments. Artifactual segment attraction is best understood in terms of asymmetries in lambda (which represents state changes per character during a particular segment interval). We relate the consistency index to numbers of terminal taxa and lambda, illustrating its dependence upon numbers of potential tree segments. Prospects for phylogenetic reconstruction from protein-coding nucleotide data are discussed with reference to the robustness of equal weighting (given our own model) with adequate taxonomic sampling.
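
Illustrative note: the abstract refers to the consistency index without defining it. As a minimal sketch, assuming the standard ensemble definition from the cladistics literature (the sum of the minimum possible steps per character divided by the sum of the steps actually observed on the tree), the index could be computed as follows. The function name and the step counts are hypothetical and are not taken from the paper; this does not reproduce the authors' weighting model.

```python
from typing import Sequence


def consistency_index(min_steps: Sequence[int], observed_steps: Sequence[int]) -> float:
    """Ensemble consistency index: sum(min steps) / sum(observed steps).

    min_steps[i]      -- fewest changes character i could require on any tree
                         (number of observed states minus one)
    observed_steps[i] -- changes character i actually requires on the tree
                         under evaluation
    """
    if len(min_steps) != len(observed_steps):
        raise ValueError("both sequences must describe the same characters")
    total_observed = sum(observed_steps)
    if total_observed == 0:
        raise ValueError("no observed steps; the index is undefined")
    return sum(min_steps) / total_observed


# Hypothetical step counts for four characters (illustration only).
example_min = [1, 1, 2, 1]
example_obs = [1, 2, 4, 1]
print(consistency_index(example_min, example_obs))  # 5 / 8 = 0.625
```

A value of 1.0 means no character changes more often than its minimum; lower values indicate more homoplasy, that is, more extra steps on the tree.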