PROTEIN FAMILY CLASSIFICATION BASED ON SEARCHING A DATABASE OF BLOCKS

Citation
S. Henikoff and J.G. Henikoff, PROTEIN FAMILY CLASSIFICATION BASED ON SEARCHING A DATABASE OF BLOCKS, Genomics, 19(1), 1994, pp. 97-107
Number of citations
44
Subject Categories
Genetics & Heredity
Journal title
Genomics
Journal ISSN
08887543
Volume
19
Issue
1
Year of publication
1994
Pages
97 - 107
Database
ISI
SICI code
0888-7543(1994)19:1<97:PFCBOS>2.0.ZU;2-R
Abstract
The most highly conserved regions of proteins can be represented as "blocks" of locally aligned sequence segments. Previously, an automated system was introduced to generate a database of blocks that is searched for local similarities using a sequence query. Here, we describe a method for searching this database that can also reveal significant global similarities. Local and global alignments are scored independently, so they can be used in concert to infer homology. A set of 7082 diverse sequences not represented in the database provided queries for testing this approach. The resulting distributions of scores led to guidelines for interpretation of search data and to the classification of 289 uncatalogued sequences into known groups. Thirty-eight of these relationships appear to be new discoveries. We also show how searching a database of blocks can be used to detect repeated domains and to find distinct cross-family relationships that were missed in searches of sequence databases. (C) 1994 Academic Press, Inc.