This paper describes approaches to improving the performance of one of the
most common and increasingly important aspects of the Human Genome Project
(HGP): large-volume, batch comparison of DNA sequence data. This basic
comparison operation, usually carried out by the well-known BLAST program
on one subject sequence against the internationally available databases of
nearly five million target sequences, is already used hundreds of thousands
of times each day by researchers around the world. At present, it is still
used primarily in single-query or small-batch query mode. As the entire
sequence
of the human genome nears completion, functional genomics and the use of
micro-arrays of sets of genes are coming to the fore. These developments
will demand ever more efficient means of BLASTing sets of data, at a scale
that will make single-processor implementations on powerful workstations
infeasible. We describe the three primary levels at which BLAST can be
parallelized. The first
is at the sequence-to-sequence comparison level. The second parallelizes a
single query across a partitioned and distributed database. Finally, the
set of queries itself is partitioned across a set of servers with
replicated or partitioned databases. The three methods may be employed
alone or
in concert. We describe our current implementation, which parallelizes
batch requests, and our plans for implementing the other levels. The
results will ultimately be applied to hardware assistance for
this soon-to-be primitive computer operation.
(C) 2001 Elsevier Science B.V. All rights reserved.
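
The second and third levels of parallelism described above (one query
against a partitioned database, and a batch of queries partitioned across
servers holding database replicas) can be illustrated with a minimal Python
sketch. It is not the implementation described in this paper and it does not
call BLAST: the scoring function toy_score (a count of shared k-mers), the
helper names, and the use of threads in place of separate servers are all
assumptions made only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import nlargest


def kmers(seq, k=3):
    """Return the set of length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def toy_score(query, target, k=3):
    """Hypothetical stand-in for an alignment score: shared k-mer count."""
    return len(kmers(query, k) & kmers(target, k))


def partition(items, n):
    """Split items into n round-robin chunks (some may be empty)."""
    return [items[i::n] for i in range(n)]


def search_shard(query, shard):
    """Score one query against every (name, sequence) pair in one shard."""
    return [(toy_score(query, seq), name) for name, seq in shard]


def search_partitioned_db(query, shards, top=2):
    """Level 2: one query, database split into shards, hits merged."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial = pool.map(lambda shard: search_shard(query, shard), shards)
        hits = [hit for part in partial for hit in part]
    return nlargest(top, hits)


def search_batch(queries, database, workers=2, top=2):
    """Level 3: query batch split across workers sharing a full replica."""
    def run_chunk(chunk):
        return {q: nlargest(top, search_shard(q, database)) for q in chunk}

    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(run_chunk, partition(queries, workers)):
            results.update(part)
    return results
```

Because each shard is scored independently and the merge only keeps the
best-scoring hits, both functions return the same top hits for a given query
as a sequential scan of the whole database would. In a real deployment the
threads would be replaced by separate processes or servers, and the merge
step would also need to account for any score statistics that depend on
total database size.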