Aa. Schaffer et al., IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices, BIOINFORMAT, 15(12), 1999, pp. 1000-1011
Motivation: Many studies have shown that database searches using
position-specific score matrices (PSSMs) or profiles as queries are more
effective at identifying distant protein relationships than are searches
that use simple sequences as queries. One popular program for constructing
a PSSM and comparing it with a database of sequences is Position-Specific
Iterated BLAST (PSI-BLAST).
Results: This paper describes a new software package, IMPALA, designed for
the complementary procedure of comparing a single query sequence with a
database of PSI-BLAST-generated PSSMs. We illustrate the use of IMPALA to
search a database of PSSMs for protein folds, and one for protein domains
involved in signal transduction. IMPALA's sensitivity to distant biological
relationships is very similar to that of PSI-BLAST. However, IMPALA employs
a more refined analysis of statistical significance and, unlike PSI-BLAST,
guarantees the output of the optimal local alignment by using the rigorous
Smith-Waterman algorithm. Also, it is considerably faster when run with a
large database of PSSMs than is BLAST or PSI-BLAST when run against the
complete non-redundant protein database.
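
Illustration: the core idea of scoring one query sequence against one PSSM with Smith-Waterman can be sketched as below. This is a minimal sketch, not IMPALA's implementation. The function name `smith_waterman_pssm`, the toy matrix `toy_pssm`, the linear gap penalty, and the default score of -1 for residues absent from a column are all assumptions made for the example. It returns only the optimal local alignment score and performs no statistical significance analysis.

```python
def smith_waterman_pssm(query, pssm, gap=4, default=-1):
    """Return the best local alignment score of `query` against `pssm`.

    `pssm` is a list with one dict per profile position, mapping a residue
    letter to its score at that position. Residues missing from a column
    score `default`. A linear gap penalty `gap` is assumed for simplicity.
    """
    n, m = len(query), len(pssm)
    prev = [0] * (m + 1)  # DP row for the previous query residue
    best = 0
    for i in range(1, n + 1):
        curr = [0] * (m + 1)
        for j in range(1, m + 1):
            # Align query residue i with PSSM column j.
            match = prev[j - 1] + pssm[j - 1].get(query[i - 1], default)
            # Gap in the profile (query residue left unaligned).
            skip_query = prev[j] - gap
            # Gap in the query (PSSM column left unaligned).
            skip_column = curr[j - 1] - gap
            # The floor at 0 makes the alignment local.
            curr[j] = max(0, match, skip_query, skip_column)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical three-column PSSM, for illustration only.
toy_pssm = [
    {"A": 5, "G": 1},
    {"C": 6},
    {"D": 4, "E": 3},
]

print(smith_waterman_pssm("KACDL", toy_pssm))  # 15 (A, C, D matched: 5 + 6 + 4)
```

Because every cell of the dynamic-programming table is filled in, the returned score is the true optimum for the chosen scoring scheme, which is the guarantee that heuristic searches give up in exchange for speed. Searching a database of PSSMs would amount to repeating this computation once per matrix.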