IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices

Citation
Aa. Schaffer et al., IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices, BIOINFORMAT, 15(12), 1999, pp. 1000-1011
Citations number
68
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
ISSN journal
1367-4803
Volume
15
Issue
12
Year of publication
1999
Pages
1000 - 1011
Database
ISI
SICI code
1367-4803(199912)15:12<1000:IMAPSA>2.0.ZU;2-K
Abstract
Motivation: Many studies have shown that database searches using position-specific score matrices (PSSMs) or profiles as queries are more effective at identifying distant protein relationships than are searches that use simple sequences as queries. One popular program for constructing a PSSM and comparing it with a database of sequences is Position-Specific Iterated BLAST (PSI-BLAST).
Results: This paper describes a new software package, IMPALA, designed for the complementary procedure of comparing a single query sequence with a database of PSI-BLAST-generated PSSMs. We illustrate the use of IMPALA to search a database of PSSMs for protein folds, and one for protein domains involved in signal transduction. IMPALA's sensitivity to distant biological relationships is very similar to that of PSI-BLAST. However, IMPALA employs a more refined analysis of statistical significance and, unlike PSI-BLAST, guarantees the output of the optimal local alignment by using the rigorous Smith-Waterman algorithm. Also, it is considerably faster when run with a large database of PSSMs than is BLAST or PSI-BLAST when run against the complete non-redundant protein database.
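The sequence-against-PSSM comparison mentioned in the abstract can be illustrated with a minimal sketch of Smith-Waterman local alignment scoring. This is not IMPALA's implementation: the function name `sw_pssm_score`, the toy matrix, the linear gap penalty, and the default score for residues missing from a column are all assumptions chosen for brevity.

```python
from typing import Dict, List


def sw_pssm_score(
    query: str,
    pssm: List[Dict[str, int]],
    gap: int = 4,        # assumed linear gap penalty (illustrative only)
    default: int = -1,   # assumed score for residues absent from a column
) -> int:
    """Return the best local alignment score of `query` against `pssm`.

    Each PSSM column is a dict mapping a residue to its position-specific
    score. Only two rows of the dynamic-programming table are kept in memory.
    """
    cols = len(pssm)
    prev = [0] * (cols + 1)
    best = 0
    for residue in query:
        curr = [0] * (cols + 1)
        for j in range(1, cols + 1):
            # Align this residue with PSSM column j-1.
            match = prev[j - 1] + pssm[j - 1].get(residue, default)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, match, prev[j] - gap, curr[j - 1] - gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical three-column matrix, not taken from any real PSSM database.
toy_pssm = [
    {"A": 5, "C": -2},
    {"C": 6, "A": -3},
    {"D": 4, "E": 2},
]

if __name__ == "__main__":
    print(sw_pssm_score("GGACDGG", toy_pssm))
```

Because every cell of the table is filled in, the returned score is the optimum for the chosen scoring scheme, which is the guarantee the abstract contrasts with heuristic search. Searching a collection of PSSMs would amount to calling the function once per matrix and ranking the results.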