S. R. Sunyaev et al., PSIC: profile extraction from sequence alignments with position-specific counts of independent observations, PROTEIN ENG, 12(5), 1999, pp. 387-394
Sequence weighting techniques are aimed at balancing redundant observed
information from subsets of similar sequences in multiple alignments.
Traditional approaches apply the same weight to all positions of a given
sequence, hence equal efficiency of phylogenetic changes is assumed along
the whole sequence. This restrictive assumption is not required for the new
method PSIC (position-specific independent counts) described in this paper.
The number of independent observations (counts) of an amino acid type at a
given alignment position is calculated from the overall similarity of the
sequences that share the amino acid type at this position with the help of
statistical concepts. This approach allows the fast computation of
position-specific sequence weights even for alignments containing hundreds
of sequences. The PSIC approach has been applied to profile extraction and
to the fold family assignment of protein sequences with known structures.
Our method was shown to be very productive in finding distantly related
sequences and more powerful than Hidden Markov Models or the profile
methods in WiseTools and PSI-BLAST in many cases. The profile extraction
routine is available on the WWW (http://www.bork.embl-heidelberg.de/PSIC or
http://www.imb.ac.ru/PSIC).
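
The abstract does not give the PSIC statistics, so the following is only a
toy illustration of the general idea of position-specific counts, not the
published formula. It assumes a simple heuristic: for each amino acid type
in a column, take the sequences that share it, measure their mean pairwise
identity over the other columns, and scale the raw count n toward 1 as that
similarity approaches 1. The function names and the interpolation rule are
hypothetical.

```python
from itertools import combinations


def fraction_identical(a, b, skip):
    """Fraction of aligned non-gap positions, excluding column `skip`,
    at which sequences a and b carry the same residue."""
    pairs = [
        (x, y)
        for i, (x, y) in enumerate(zip(a, b))
        if i != skip and x != "-" and y != "-"
    ]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def effective_counts(alignment, column):
    """Toy effective number of observations per residue type in one column.

    Assumption (not the PSIC formula): n identical sequences count as one
    observation, n fully dissimilar sequences count as n observations, and
    intermediate cases are interpolated linearly.
    """
    # Group the sequences by the residue they show in this column.
    groups = {}
    for seq in alignment:
        residue = seq[column]
        if residue != "-":
            groups.setdefault(residue, []).append(seq)

    counts = {}
    for residue, seqs in groups.items():
        n = len(seqs)
        if n == 1:
            counts[residue] = 1.0
            continue
        sims = [fraction_identical(a, b, column) for a, b in combinations(seqs, 2)]
        mean_sim = sum(sims) / len(sims)
        counts[residue] = 1.0 + (n - 1) * (1.0 - mean_sim)
    return counts


if __name__ == "__main__":
    alignment = ["ACDE", "ACDE", "AGHK", "TCDE"]
    print(effective_counts(alignment, 0))
```

In this sketch the two identical sequences contribute little beyond a single
observation of A, while the divergent third sequence adds nearly a full
count. Because the count is computed per column and per residue type, the
same sequence can carry different weight at different positions, which is
the property the abstract contrasts with traditional whole-sequence weights.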