In recent years, improvements to existing programs and the introduction of
new iterative algorithms have changed the state of the art in protein
sequence alignment. This paper presents the first systematic study of the
most commonly used alignment programs using BAliBASE benchmark alignments
as test
cases. Even below the 'twilight zone', at 10-20% residue identity, the best
programs correctly aligned on average 47% of the residues.
We show that iterative algorithms often offer improved alignment accuracy,
though at the expense of computation time. A notable exception was the
effect of introducing a single divergent sequence into a set of closely
related sequences, which caused the iteration to diverge from the best
alignment.
Global alignment programs generally performed better than local methods,
except in the presence of large N/C-terminal extensions and internal
insertions. In these cases, a local algorithm was more successful in
identifying the most conserved motifs. This study enables us to propose
appropriate alignment strategies, depending on the nature of a particular
set of sequences.
Using more than one program, each based on a different alignment technique,
should significantly improve the quality of automatic protein sequence
alignment. The results also suggest guidelines for improving alignment
algorithms.