Separation of phylogenetic and functional associations in biological sequences by using the parametric bootstrap

Citation
K.R. Wollenberg and W.R. Atchley, Separation of phylogenetic and functional associations in biological sequences by using the parametric bootstrap, Proc. Natl. Acad. Sci. USA, 97(7), 2000, pp. 3288-3291
Number of citations
25
Subject categories
Multidisciplinary
Journal title
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
Journal ISSN
0027-8424
Volume
97
Issue
7
Year of publication
2000
Pages
3288 - 3291
Database
ISI
SICI code
0027-8424(20000328)97:7<3288:SOPAFA>2.0.ZU;2-W
Abstract
Quantitative analyses of biological sequences generally proceed under the assumption that individual DNA or protein sequence elements vary independently. However, this assumption is not biologically realistic because sequence elements often vary in a concerted manner resulting from common ancestry and structural or functional constraints. We calculated intersite associations among aligned protein sequences by using mutual information. To discriminate associations resulting from common ancestry from those resulting from structural or functional constraints, we used a parametric bootstrap algorithm to construct replicate data sets. These data are expected to have intersite associations resulting solely from phylogeny. By comparing the distribution of our association statistic for the replicate data against that calculated for empirical data, we were able to assign a probability that two sites covaried resulting from structural or functional constraint rather than phylogeny. We tested our method by using an alignment of 237 basic helix-loop-helix (bHLH) protein domains. Comparison of our results against a solved three-dimensional structure confirmed the identification of several sites important to function and structure of the bHLH domain. This analytical procedure has broad utility as a first step in the identification of sites that are important to biological macromolecular structure and function when a solved structure is unavailable.
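
Illustrative sketch
The two computational steps named in the abstract, a mutual-information statistic for a pair of alignment columns and its comparison against a distribution from replicate data sets, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names (`mutual_information`, `column`, `site_pair_p_value`) are hypothetical, the replicate alignments are assumed to have been simulated elsewhere along an estimated phylogeny, and the add-one correction in the empirical probability is an assumption of this sketch rather than something the abstract specifies.

```python
import math
from collections import Counter


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p_ab * log(p_ab / (p_a * p_b)), written with raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def site_pair_p_value(alignment, replicates, i, j):
    """Share of replicate alignments whose MI for sites (i, j) is at least
    the observed MI. Replicates are assumed to carry phylogenetic signal
    only. The +1 terms are an assumed small-sample correction."""
    observed = mutual_information(column(alignment, i), column(alignment, j))
    exceed = sum(
        1
        for rep in replicates
        if mutual_information(column(rep, i), column(rep, j)) >= observed
    )
    return (exceed + 1) / (len(replicates) + 1)
```

Under these assumptions, a small value from `site_pair_p_value` suggests that the observed association between two sites is stronger than phylogeny alone would be expected to produce.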