DETECTION OF CONSERVED SEGMENTS IN PROTEINS - ITERATIVE SCANNING OF SEQUENCE DATABASES WITH ALIGNMENT BLOCKS

Authors

TATUSOV RL; ALTSCHUL SF; KOONIN EV

Citation

R.L. Tatusov et al., DETECTION OF CONSERVED SEGMENTS IN PROTEINS - ITERATIVE SCANNING OF SEQUENCE DATABASES WITH ALIGNMENT BLOCKS, Proceedings of the National Academy of Sciences of the United States of America, 91(25), 1994, pp. 12091-12095

Subject Categories

Multidisciplinary Sciences

Journal title

Proceedings of the National Academy of Sciences of the United States of America

ISSN journal

0027-8424

Volume

91

Issue

25

Year of publication

1994

Pages

12091 - 12095

Database

ISI

SICI code

0027-8424(1994)91:25<12091:DOCSIP>2.0.ZU;2-D

Abstract

We describe an approach to analyzing protein sequence databases that, starting from a single uncharacterized sequence or group of related sequences, generates blocks of conserved segments. The procedure involves iterative database scans with an evolving position-dependent weight matrix constructed from a coevolving set of aligned conserved segments. For each iteration, the expected distribution of matrix scores under a random model is used to set a cutoff score for the inclusion of a segment in the next iteration. This cutoff may be calculated to allow the chance inclusion of either a fixed number or a fixed proportion of false positive segments. With sufficiently high cutoff scores, the procedure converged for all alignment blocks studied, with varying numbers of iterations required. Different methods for calculating weight matrices from alignment blocks were compared. The most effective of those tested was a logarithm-of-odds, Bayesian-based approach that used prior residue probabilities calculated from a mixture of Dirichlet distributions. The procedure described was used to detect novel conserved motifs of potential biological importance.
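
The iterative scan summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several simplifying assumptions: background-weighted pseudocounts stand in for the Dirichlet-mixture priors, a uniform background distribution is assumed, and the cutoff is a fixed number supplied by the caller rather than one derived from the expected score distribution under a random model. The names `build_log_odds_matrix`, `best_segment`, and `iterate_block` are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_log_odds_matrix(block, background=None, pseudocount=1.0):
    """Build a position-dependent log-odds matrix from equal-length segments.

    Assumption: simple background-weighted pseudocounts, not Dirichlet mixtures.
    """
    if background is None:
        # Assumption: uniform background residue frequencies.
        background = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}
    width = len(block[0])
    total = len(block) + pseudocount
    matrix = []
    for pos in range(width):
        counts = Counter(segment[pos] for segment in block)
        column = {}
        for a in AMINO_ACIDS:
            freq = (counts[a] + pseudocount * background[a]) / total
            column[a] = math.log2(freq / background[a])
        matrix.append(column)
    return matrix


def best_segment(matrix, sequence):
    """Return (score, segment) for the highest-scoring window, or None."""
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        segment = sequence[start:start + width]
        # Non-standard residues contribute a neutral score of 0.
        score = sum(matrix[i].get(segment[i], 0.0) for i in range(width))
        if best is None or score > best[0]:
            best = (score, segment)
    return best


def iterate_block(seed_block, database, cutoff, max_iterations=20):
    """Rescan the database until the set of included segments stops changing."""
    block = sorted(set(seed_block))
    for _ in range(max_iterations):
        matrix = build_log_odds_matrix(block)
        hits = []
        for sequence in database:
            hit = best_segment(matrix, sequence)
            if hit is not None and hit[0] >= cutoff:
                hits.append(hit[1])
        new_block = sorted(set(hits))
        if not new_block or new_block == block:
            break  # nothing passed the cutoff, or the block has converged
        block = new_block
    return block
```

For example, calling `iterate_block(["GLVKW"], ["MKTAYIAKQRGGLVKW", "PPGLVKWAAA", "AAAAGLIKWCC", "DDDDDDDDDDDD"], cutoff=10.0)` grows the block to include the related segment `GLIKW` and then stops once a rescan adds nothing new. A lower cutoff would admit more chance matches, which is the trade-off the abstract addresses by tying the cutoff to an expected number or proportion of false positives.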