This paper is concerned with the development of techniques for massively pa
rallel computation at the molecular scale, which we refer to as molecular p
arallelism. While this may at first appear to be purely science fiction, Ad
leman [Ad1] has already employed molecular parallelism in the solution of t
he Hamiltonian path problem, and successfully tested his techniques in a la
b experiment on DNA for a small graph. Lipton [L] showed that finding the s
atisfying inputs to a Boolean expression of size n can be done in O (n) lab
steps using DNA of length O (n log n) base pairs. This recent work by Adle
man and Lipton in molecular parallelism considered only the solution of NP
search problems, and provided no way of quickly executing lengthy computati
ons by purely molecular means; the number of lab steps depended linearly on
the size of the simulated expression. See [Re3] for further recent work on
molecular parallelism and see [Re4] for an extensive survey of molecular p
arallelism.
Our goal is to execute lengthy computations quickly by the use of molecular
parallelism. We wish to execute these biomolecular computations using shor
t DNA strands by more or less conventional biotechnology engineering techni
ques within a small number of lab steps. This paper describes techniques fo
r achieving this goal, in the context of well defined abstract models of bi
omolecular computation. Although our results are of theoretical consequence
only due to the large amount of molecular parallelism (i.e., large test tu
be volume) required, we believe that our theoretical models and results may
be a basis for more practical later work, just as was done in the area of
parallel computing.
We propose two abstract models of biomolecular computation. The first, the
Parallel Associative Memory (PAM) model, is a very high-level model which i
ncludes a Parallel Associative Matching (PA-Match) operation, that appears
to improve the power of molecular parallelism beyond the operations previou
sly considered by Lipton [L]. We give some simulations of conventional sequ
ential and parallel computational models by our PAM model. Each of the simu
lations use strings of length O (s) over an alphabet of size O (s) (which c
orrespond to DNA of length O (s log s) base pairs). Using O (s log s) PAM o
perations that are not PA-Match (or O (s) operations assuming a ligation op
eration) and t PA-Match operations, we can:
1. simulate a nondeterministic Turing Machine computation with space bound
s and time bound 2(sigma (s)), with r = O (s),
2. simulate a CREW PRAM with time bound D, with M memory cells, and process
or bound P, where here s = O (log(PM)) and t = O (D + s),
3. find the satisfying inputs to a Boolean circuit constructible in s space
with n inputs, unbounded fan-out, and depth D, where here t = O (D + s).
We also propose a Recombinant DNA (RDNA) model which is a low-level model t
hat allows operations that are abstractions of very well understood recombi
nant DNA operations and provides a representation, which we call the comple
x, for the relevant structural properties of DNA. The PA-Match operation fo
r lengthy strings of length s cannot be feasibly implemented by recombinant
DNA techniques directly by a single step of complementary pairing in DNA;
nevertheless we show this Matching operation can be simulated in the RDNA m
odel with O (s) slowdown by multiple steps of complementary pairing of subs
trings of length 2 (corresponding to logarithmic length DNA subsequences).
Each of the other operations of the PAM model can be executed in our RDNA m
odel, without slowdown.
We further show that, with a further O (s) / log(1/epsilon) slowdown, the s
imulations can be done correctly with probability 1/2 even if certain recom
binant DNA operations (e.g., Separation) can error with a probability epsil
on. We also observe efficient simulations can be done by PRAMs and thus Tur
ing Machines of our molecular models.