External memory (EM) algorithms are designed for computational problems in
which the size of the internal memory of the computer is only a small fract
ion of the problem size. Blockwise access to data is a central theme in the
design of efficient EM algorithms, A similar requirement arises in the tra
nsmission of data between processors in high performance parallel computati
on systems, for which blockwise communication is a crucial issue.
We consider multisearch problems, where a large number of queries are to be
simultaneously processed and satisfied by navigating through large data st
ructures on parallel computers, Our examples originate as algorithms for pa
rallel machines, and we adapt them to the EM situation where the queries an
d data structure are considered to be much larger than the size of the avai
lable internal memory.
This paper presents techniques to achieve blocking for I/O as well as for c
ommunication in multisearch on the BSP and EM-BSP models. We describe impro
vements to the 1-optimal BSP* multisearch algorithm of [8] which permit lar
ger search trees to be handled.
In the area of EM algorithms new algorithms for multisearch in balanced tre
es are described. For search trees of size O (n log n) where n is the numbe
r of queries, we obtain a work-optimal, parallel, randomized EM multisearch
algorithm whose I/O and communication time are smaller, asymptotically, th
an the computation time. We obtain a deterministic version via a similar te
chnique. These algorithms are obtained via the simulation techniques of [12
], [17], [13], [14], and [24].
For larger trees we describe a parallel, EM algorithm which is I-optimal co
nsidering computation, communication, and I/O (for suitable parameter const
ellations) plus I/O-optimal. We give a lower bound to the number of I/O ope
rations required for filtering n queries through a binary or multiway searc
h tree of size m when m greater than or equal to n(2 + epsilon) for a const
ant epsilon > 0.