ParAlign: a parallel sequence alignment algorithm for rapid and sensitive database searches

Authors
Citation
T. Rognes, ParAlign: a parallel sequence alignment algorithm for rapid and sensitive database searches, NUCL ACID R, 29(7), 2001, pp. 1647-1652
Citations number
26
Categorie Soggetti
Biochemistry & Biophysics
Journal title
NUCLEIC ACIDS RESEARCH
ISSN journal
03051048 → ACNP
Volume
29
Issue
7
Year of publication
2001
Pages
1647 - 1652
Database
ISI
SICI code
0305-1048(20010401)29:7<1647:PAPSAA>2.0.ZU;2-S
Abstract
There is a need for faster and more sensitive algorithms for sequence simil arity searching in view of the rapidly increasing amounts of genomic sequen ce data available. Parallel processing capabilities in the form of the sing le instruction, multiple data (SIMD) technology are now available in common microprocessors and enable a single microprocessor to perform many operati ons in parallel. The ParAlign algorithm has been specifically designed to t ake advantage of this technology. The new algorithm initially exploits para llelism to perform a very rapid computation of the exact optimal ungapped a lignment score for all diagonals in the alignment matrix. Then, a novel heu ristic is employed to compute an approximate score of a gapped alignment by combining the scores of several diagonals. This approximate score is used to select the most interesting database sequences for a subsequent Smith-Wa terman alignment, which is also parallelised, The resulting method represen ts a substantial improvement compared to existing heuristics. The sensitivi ty and specificity of ParAlign was found to be as good as Smith-Waterman im plementations when the same method for computing the statistical significan ce of the matches was used, In terms of speed, only the significantly less sensitive NCBI BLAST 2 program was found to outperform the new approach, On line searches are available at http://dna.uio.no/search/.