DETECTION OF CONSERVED SEGMENTS IN PROTEINS - ITERATIVE SCANNING OF SEQUENCE DATABASES WITH ALIGNMENT BLOCKS

Citation
R.L. Tatusov et al., DETECTION OF CONSERVED SEGMENTS IN PROTEINS - ITERATIVE SCANNING OF SEQUENCE DATABASES WITH ALIGNMENT BLOCKS, Proceedings of the National Academy of Sciences of the United States of America, 91(25), 1994, pp. 12091-12095
Number of citations
47
Subject categories
Multidisciplinary Sciences
Journal ISSN
00278424
Volume
91
Issue
25
Year of publication
1994
Pages
12091 - 12095
Database
ISI
SICI code
0027-8424(1994)91:25<12091:DOCSIP>2.0.ZU;2-D
Abstract
We describe an approach to analyzing protein sequence databases that, starting from a single uncharacterized sequence or group of related sequences, generates blocks of conserved segments. The procedure involves iterative database scans with an evolving position-dependent weight matrix constructed from a coevolving set of aligned conserved segments. For each iteration, the expected distribution of matrix scores under a random model is used to set a cutoff score for the inclusion of a segment in the next iteration. This cutoff may be calculated to allow the chance inclusion of either a fixed number or a fixed proportion of false positive segments. With sufficiently high cutoff scores, the procedure converged for all alignment blocks studied, with varying numbers of iterations required. Different methods for calculating weight matrices from alignment blocks were compared. The most effective of those tested was a logarithm-of-odds, Bayesian-based approach that used prior residue probabilities calculated from a mixture of Dirichlet distributions. The procedure described was used to detect novel conserved motifs of potential biological importance.
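
The iterative procedure summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several simplifying assumptions: a uniform background distribution, a single pseudocount prior in place of the mixture of Dirichlet distributions, ungapped windows, at most one segment taken per database sequence, and sequences restricted to the 20 standard residues. All function names and parameters (`build_pssm`, `score_cutoff`, `best_window`, `iterate_block`, `expected_fp`, `scale`) are hypothetical. Scores are rounded to integers so that the exact score distribution under the random model can be computed by convolution, and the cutoff is chosen to allow a fixed expected number of false positive segments.

```python
import math
from collections import defaultdict

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform background; real residue frequencies are not uniform.
BACKGROUND = {a: 1.0 / len(ALPHABET) for a in ALPHABET}


def build_pssm(block, pseudocount=1.0, scale=10):
    """Integer log-odds weight matrix from equal-length aligned segments."""
    pssm = []
    for pos in range(len(block[0])):
        counts = defaultdict(int)
        for segment in block:
            counts[segment[pos]] += 1
        column = {}
        for a in ALPHABET:
            # Single pseudocount prior, standing in for a Dirichlet mixture.
            freq = (counts[a] + pseudocount * BACKGROUND[a]) / (len(block) + pseudocount)
            column[a] = round(scale * math.log(freq / BACKGROUND[a]))
        pssm.append(column)
    return pssm


def score_cutoff(pssm, n_windows, expected_fp):
    """Lowest score whose expected count of chance hits among n_windows
    random windows does not exceed expected_fp."""
    dist = {0: 1.0}  # exact score distribution, built column by column
    for column in pssm:
        nxt = defaultdict(float)
        for score, prob in dist.items():
            for a in ALPHABET:
                nxt[score + column[a]] += prob * BACKGROUND[a]
        dist = nxt
    cutoff, tail = max(dist) + 1, 0.0
    for score in sorted(dist, reverse=True):
        tail += dist[score]
        if tail * n_windows > expected_fp:
            break
        cutoff = score
    return cutoff


def best_window(pssm, sequence):
    """Return (score, start) of the best ungapped window, or None if too short."""
    width, best = len(pssm), None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col[res] for col, res in zip(pssm, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


def iterate_block(seed_block, database, expected_fp=0.01, max_iter=20):
    """Rebuild the matrix and rescan until the set of included segments is stable."""
    width = len(seed_block[0])
    n_windows = sum(max(len(s) - width + 1, 0) for s in database)
    block, members = list(seed_block), None
    for iteration in range(1, max_iter + 1):
        pssm = build_pssm(block)
        cutoff = score_cutoff(pssm, n_windows, expected_fp)
        hits = {}
        for idx, seq in enumerate(database):
            best = best_window(pssm, seq)
            if best is not None and best[0] >= cutoff:
                hits[(idx, best[1])] = seq[best[1]:best[1] + width]
        if not hits or set(hits) == members:
            return block, iteration  # converged, or nothing passed the cutoff
        members = set(hits)
        block = [hits[key] for key in sorted(hits)]
    return block, max_iter
```

Calling `iterate_block(["GKSGT"], database)` on a small list of protein strings returns the final block of segments together with the number of iterations used. A fixed-proportion cutoff, the other option mentioned in the abstract, would need a different stopping rule inside `score_cutoff` and is not shown here.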