On combinatorial DNA word design

Citation
A. Marathe et al., On combinatorial DNA word design, J COMPUT BI, 8(3), 2001, pp. 201-219
Citations number
31
Categorie Soggetti
Biochemistry & Biophysics
Journal title
JOURNAL OF COMPUTATIONAL BIOLOGY
ISSN journal
10665277 → ACNP
Volume
8
Issue
3
Year of publication
2001
Pages
201 - 219
Database
ISI
SICI code
1066-5277(2001)8:3<201:OCDWD>2.0.ZU;2-7
Abstract
We consider the problem of designing DNA codes, namely sets of equi-length words over the alphabet {A, C, G, T} that satisfy certain combinatorial con straints. This problem is motivated by the task of reliably storing and ret rieving information in synthetic DNA strands for use in DNA computing or as molecular bar codes in chemical libraries. The primary constraints that we consider, defined with respect to a parameter d, are as follows: for every pair of words w, x in a code, there are at least d mismatches between w an d x if w not equal x and also between the reverse of w and the Watson-Crick complement of x. Extending classical results from coding theory, we presen t several upper and lower bounds on the maximum size of such DNA codes and give methods for constructing such codes. An additional constraint that is relevant to the design of DNA codes is that the free energies and enthalpie s of the code words, and thus the melting temperatures, be similar. We desc ribe dynamic programming algorithms that can (a) calculate the total number of words of length n whose free energy value, as approximated by a formula of Breslauer et al. (1986) falls in a given range, and (b) output a random such word. These algorithms are intended for use in heuristic algorithms f or constructing DNA codes.