Key residues approach to the definition of protein families and analysis of sparse family signatures

Citation
J.C. Ison et al., Key residues approach to the definition of protein families and analysis of sparse family signatures, PROTEINS, 40(2), 2000, pp. 330-341
Number of citations
25
Subject Categories
Biochemistry & Biophysics
Journal title
PROTEINS-STRUCTURE FUNCTION AND GENETICS
Journal ISSN
0887-3585
Volume
40
Issue
2
Year of publication
2000
Pages
330 - 341
Database
ISI
SICI code
0887-3585(20000801)40:2<330:KRATTD>2.0.ZU;2-I
Abstract
We extend the concept of the motif as a tool for characterizing protein families and explore the feasibility of a sparse "motif" that is the length of the protein sequence itself. The type of motif discussed is a sparse family signature consisting of a set of N key residue positions (A1, A2, ..., AN) preceded by gaps (G), thus G1A1G2A2...GNAN. Both a residue and a gap can be variable. A signature is matched to a protein sequence and scored using a dynamic programming algorithm which permits variability in gap distance and residue type. Generating a signature involves identifying residues associated with points of contact in interactions between secondary structure elements. A raw signature consists of a set of positions with potential key structural roles sampled from a sequence alignment constructed with reference to this contact data. Raw signatures are refined by sampling different gap-residue pairs until the specificity of a signature for the family cannot be further improved. We summarize signatures for nine families of protein of diverse fold and function and present results of scans against the OWL protein sequence database. The implications of such signatures are discussed. (C) 2000 Wiley-Liss, Inc.
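
The matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `score_signature`, the representation of each gap-residue pair as two score dictionaries, and all score values are assumptions made purely for illustration. It shows one plausible way a dynamic programming pass could score a signature G1A1G2A2...GNAN against a sequence while allowing each gap length and residue type to vary.

```python
from math import inf

def score_signature(signature, sequence):
    """Return the best score for placing every key residue of a sparse
    signature on the sequence, or -inf if no placement is possible.

    signature: list of (gap_scores, residue_scores) pairs, one per key
    position. gap_scores maps an allowed gap length (residues skipped
    before the key residue) to a score; residue_scores maps an allowed
    residue letter to a score. Both layouts are hypothetical.
    """
    prev = None  # prev[i] = best score with the previous key residue at i
    for k, (gap_scores, residue_scores) in enumerate(signature):
        cur = [-inf] * len(sequence)
        for j, aa in enumerate(sequence):
            res = residue_scores.get(aa, -inf)
            if res == -inf:
                continue  # this residue type is not allowed here
            for gap, gap_score in gap_scores.items():
                i = j - gap - 1  # index of the preceding key residue
                if k == 0:
                    # The first gap is measured from the sequence start.
                    if i != -1:
                        continue
                    base = 0.0
                else:
                    if i < 0 or prev[i] == -inf:
                        continue
                    base = prev[i]
                cur[j] = max(cur[j], base + gap_score + res)
        prev = cur
    return max(prev, default=-inf) if prev is not None else -inf

# Toy two-position signature (illustrative values only).
signature = [
    ({0: 1.0, 1: 0.5}, {"C": 2.0}),
    ({2: 1.0, 3: 0.5}, {"H": 2.0, "Y": 1.0}),
]
print(score_signature(signature, "ACDEH"))
print(score_signature(signature, "CAAAY"))
```

In this sketch a disallowed gap or residue simply rules out a placement; a scheme with graded penalties for unexpected gaps or substitutions would fit the same recurrence by replacing the dictionary lookups with scoring functions.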