PROF_PAT 1.3: Updated database of patterns used to detect local similarities

Citation
Ag. Bachinsky et al., PROF_PAT 1.3: Updated database of patterns used to detect local similarities, BIOINFORMAT, 16(4), 2000, pp. 358-366
Number of citations
19
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
ISSN journal
1367-4803
Volume
16
Issue
4
Year of publication
2000
Pages
358 - 366
Database
ISI
SICI code
1367-4803(200004)16:4<358:P1UDOP>2.0.ZU;2-F
Abstract
Motivation: When analysing novel protein sequences, it is now essential to extend search strategies to include a range of 'secondary' databases. Pattern databases have become vital tools for identifying distant relationships in sequences, and hence for predicting protein function and structure. The main drawback of such methods is the relatively small representation of proteins in trial samples at the time of their construction. Therefore, a negative result of an amino acid sequence comparison with such a databank forces a researcher to search for similarities in the original protein banks. We developed a database of patterns constructed for groups of related proteins with maximum representation of amino acid sequences of SWISS-PROT in the groups.

Results: Software tools and a new method have been designed to construct patterns of protein families. By using such method, a new version of databank of protein family patterns, PROF_PAT 1.3, is produced. This bank is based on SWISS-PROT (r1.38) and TrEMBL (r1.11), and contains patterns of more than 13 000 groups of related proteins in a format similar to that of the PROSITE. Motifs of patterns, which had the minimum level of probability to be found in random sequences, were selected. Flexible fast search program accompanies the bank. The researcher can specify a similarity matrix (the type PAM, BLOSUM and other). Variable levels of similarity can be set (permitting search strategies ranging from exact matches to increasing levels of 'fuzziness').
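
Illustrative note: the idea of searching with a PROSITE-like pattern at adjustable levels of 'fuzziness' can be sketched as follows. This is not the PROF_PAT search program or its pattern format. It is a minimal sketch that assumes a simplified subset of PROSITE-style syntax and replaces the similarity matrix (PAM, BLOSUM) with a plain mismatch count. The names `parse_pattern`, `search` and `max_mismatches`, and the example sequence, are hypothetical.

```python
# Hypothetical sketch: fixed-length PROSITE-style pattern search with a
# tunable number of allowed mismatches (0 = exact match).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_pattern(pattern):
    """Turn a simplified PROSITE-style pattern into a list of allowed-residue sets.

    Supported elements, separated by '-': a single residue ('C'), any
    residue ('x'), a set ('[AG]'), an exclusion ('{P}'), and a fixed
    repeat count such as 'x(2)'. Variable-length repeats are not handled.
    """
    positions = []
    for element in pattern.split("-"):
        repeat = 1
        if element.endswith(")"):
            element, count = element[:-1].split("(")
            repeat = int(count)
        if element == "x":
            allowed = set(AMINO_ACIDS)
        elif element.startswith("["):
            allowed = set(element[1:-1])
        elif element.startswith("{"):
            allowed = AMINO_ACIDS - set(element[1:-1])
        else:
            allowed = {element}
        positions.extend([allowed] * repeat)
    return positions


def search(sequence, pattern, max_mismatches=0):
    """Slide the pattern along the sequence and report near matches.

    Returns (start, matched_segment, mismatches) for every window whose
    number of non-matching positions does not exceed max_mismatches.
    """
    positions = parse_pattern(pattern)
    width = len(positions)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(res not in allowed
                         for res, allowed in zip(window, positions))
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    seq = "MKCDEAGHLCWWGPH"  # made-up sequence for illustration
    pat = "C-x(2)-[AG]-{P}-H"
    print(search(seq, pat))                    # exact matches only
    print(search(seq, pat, max_mismatches=1))  # one mismatch tolerated
```

Raising `max_mismatches` widens the search from exact matches to progressively looser ones. A matrix-based variant would score each position with substitution values and compare the total against a threshold instead of counting mismatches.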