FINDING FLEXIBLE PATTERNS IN UNALIGNED PROTEIN SEQUENCES

Citation
I. Jonassen et al., FINDING FLEXIBLE PATTERNS IN UNALIGNED PROTEIN SEQUENCES, Protein science, 4(8), 1995, pp. 1587-1595
Number of citations
24
Subject categories
Biology
Journal title
Protein Science
ISSN journal
09618368
Volume
4
Issue
8
Year of publication
1995
Pages
1587 - 1595
Database
ISI
SICI code
0961-8368(1995)4:8<1587:FFPIUP>2.0.ZU;2-8
Abstract
We present a new method for the identification of conserved patterns in a set of unaligned related protein sequences. It is able to discover patterns of a quite general form, allowing for both ambiguous positions and for variable length wildcard regions. It allows the user to define a class of patterns (e.g., the degree of ambiguity allowed and the length and number of gaps), and the method is then guaranteed to find the conserved patterns in this class scoring highest according to a significance measure defined. Identified patterns may be refined using one of two new algorithms. We present a new (nonstatistical) significance measure for flexible patterns. The method is shown to recover known motifs for PROSITE families and is also applied to some recently described families from the literature.
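
To make the pattern class concrete, the following minimal sketch shows what a flexible pattern with an ambiguous position and a variable-length wildcard region looks like when matched against unaligned sequences. It is not the discovery algorithm or the significance measure from the paper. It only checks a given pattern, written in PROSITE-style notation, against each sequence. The helper names `prosite_to_regex` and `matching_sequences`, the example pattern, and the toy sequences are all assumptions made for illustration.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supported elements (illustrative subset only):
      A          a fixed residue
      [DE]       an ambiguous position (any listed residue)
      x          a wildcard position
      x(2,4)     a variable-length wildcard region (2 to 4 residues)
      x(3)       a fixed-length wildcard region
    """
    element_re = r"(x|[A-Z]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?"
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(element_re, element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        symbol, low, high = m.groups()
        atom = "." if symbol == "x" else symbol
        if low is None:
            parts.append(atom)
        elif high is None:
            parts.append(f"{atom}{{{low}}}")
        else:
            parts.append(f"{atom}{{{low},{high}}}")
    return "".join(parts)


def matching_sequences(pattern: str, sequences: dict) -> list:
    """Return the names of the sequences that contain the pattern anywhere."""
    regex = re.compile(prosite_to_regex(pattern))
    return [name for name, seq in sequences.items() if regex.search(seq)]


# Toy, made-up sequences (not real protein data).
toy_sequences = {
    "seq1": "MKCAAEHLL",   # C, 2 wildcard residues, E, H
    "seq2": "GGCTTTTDHK",  # C, 4 wildcard residues, D, H
    "seq3": "MKCAEHLL",    # only 1 residue between C and E: no match
}

hits = matching_sequences("C-x(2,4)-[DE]-H", toy_sequences)
print(hits)
```

Because the wildcard region may span two to four residues, the same pattern matches seq1 and seq2 even though the conserved residues sit at different offsets, which is why no prior alignment is needed to test such a pattern.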