In this paper we study the problem of simulating shared memory on the
distributed memory machine (DMM). Our approach uses multiple copies of
shared memory cells, distributed among the memory modules of the DMM via
universal hashing. The main aim is to design strategies that resolve
contention at the memory modules. Extending results and methods from random
graphs and very fast randomized algorithms, we present new simulation
techniques that enable us to improve the previously best results
exponentially. In particular, we show that an n-processor CRCW PRAM can be
simulated by an n-processor DMM with delay O(log log log n log* n), with
high probability.
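The placement of copies by hashing can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the Carter-Wegman family h(x) = ((a x + b) mod P) mod n as the universal class, three copies per cell, and the helper names make_hash, place_copies and max_contention. It only shows where copies land and how much contention arises if every copy is accessed; it does not implement the access strategies developed in the paper.

```python
import random
from collections import Counter

# Mersenne prime 2^31 - 1; assumed to exceed every cell address used below.
P = 2_147_483_647


def make_hash(n_modules, rng):
    """Draw h(x) = ((a*x + b) mod P) mod n_modules at random (illustrative family)."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % n_modules


def place_copies(cell, hashes):
    """Return the module index holding each copy of `cell`, one per hash function."""
    return [h(cell) for h in hashes]


def max_contention(cells, hashes):
    """Largest number of requests hitting one module if all copies are accessed."""
    load = Counter(m for cell in cells for m in place_copies(cell, hashes))
    return max(load.values())


rng = random.Random(0)
n = 1024                                        # processors and memory modules
hashes = [make_hash(n, rng) for _ in range(3)]  # 3 copies: hypothetical choice
requests = rng.sample(range(10**6), n)          # one distinct cell per processor

print(place_copies(42, hashes))
print(max_contention(requests, hashes))
```

Accessing every copy of every requested cell, as this sketch does, is the naive baseline. A simulation strategy can instead choose which copies to access so that no module receives too many requests.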
Next we describe a general technique that turns these simulations into
time-processor optimal ones when the machine to be simulated is an EREW
PRAM. We obtain a time-processor optimal simulation of an
(n log log log n log* n)-processor EREW PRAM on an n-processor DMM with
delay O(log log log n log* n), with high probability. When an
(n log log log n log* n)-processor CRCW PRAM is simulated, the delay is
larger only by a factor of log* n.
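The optimality claim can be checked with a short work count. This is a sketch under one reading of the abstract: t(n) and N are shorthand introduced here, and the CRCW line assumes that the extra log* n factor is measured against the EREW delay bound.

```latex
\begin{align*}
t(n) &= \log\log\log n \cdot \log^* n, \qquad N = n \cdot t(n) \\
\text{EREW:}\quad & n \cdot O\bigl(t(n)\bigr) = O(N) \\
\text{CRCW:}\quad & n \cdot O\bigl(t(n)\,\log^* n\bigr) = O(N \log^* n)
\end{align*}
```

One step of the N-processor PRAM performs up to N operations, so n processors need time at least N/n = t(n) per simulated step. The EREW simulation matches this up to a constant factor, while under this reading the CRCW simulation is within a log* n factor of it.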
We further demonstrate that the simulations presented cannot be
significantly improved using our techniques. We show an
Omega(log log log n / log log log log n) lower bound on the expected delay
for a class of PRAM simulations, called topological simulations, that
covers all previously known simulations as well as the simulations
presented in the paper.