A comprehensive comparison of multiple sequence alignment programs

Citation
J.D. Thompson et al., A comprehensive comparison of multiple sequence alignment programs, Nucleic Acids Research, 27(13), 1999, pp. 2682-2690
Number of citations
30
Subject categories
Biochemistry & Biophysics
Journal title
NUCLEIC ACIDS RESEARCH
ISSN journal
0305-1048
Volume
27
Issue
13
Year of publication
1999
Pages
2682 - 2690
Database
ISI
SICI code
0305-1048(19990701)27:13<2682:ACCOMS>2.0.ZU;2-1
Abstract
In recent years improvements to existing programs and the introduction of new iterative algorithms have changed the state-of-the-art in protein sequence alignment. This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases. Even below the 'twilight zone' at 10-20% residue identity, the best programs were capable of correctly aligning on average 47% of the residues. We show that iterative algorithms often offer improved alignment accuracy though at the expense of computation time. A notable exception was the effect of introducing a single divergent sequence into a set of closely related sequences, causing the iteration to diverge away from the best alignment. Global alignment programs generally performed better than local methods, except in the presence of large N/C-terminal extensions and internal insertions. In these cases, a local algorithm was more successful in identifying the most conserved motifs. This study enables us to propose appropriate alignment strategies, depending on the nature of a particular set of sequences. The employment of more than one program based on different alignment techniques should significantly improve the quality of automatic protein sequence alignment methods. The results also indicate guidelines for improvement of alignment algorithms.
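
Note on "residue identity": the abstract places the 'twilight zone' at 10-20% residue identity but does not state how identity is computed. The sketch below shows one common convention, under the assumption that identity is counted over the alignment columns where both sequences carry a residue (gap characters excluded). The function name `percent_identity` and the gap symbol `-` are illustrative choices, not taken from the paper.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues between two rows of an alignment.

    Assumption: only columns where neither sequence has a gap are counted.
    Other conventions (e.g. dividing by the shorter sequence length) give
    different values for the same pair.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Five gap-free columns, four of them identical.
    print(percent_identity("ACDE-G", "ACQEFG"))
```

Under this convention, a pair scoring between 10 and 20 would fall in the identity range the abstract refers to.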