ITA
ENG

On combinatorial DNA word design

Authors

Marathe, A Condon, AE Corn, RM

Citation

A. Marathe et al., On combinatorial DNA word design, J COMPUT BI, 8(3), 2001, pp. 201-219

Citations number

Categorie Soggetti

Biochemistry & Biophysics

Journal title

JOURNAL OF COMPUTATIONAL BIOLOGY

ISSN journal

10665277 → ACNP

Volume

Issue

Year of publication

2001

Pages

201 - 219

Database

ISI

SICI code

1066-5277(2001)8:3<201:OCDWD>2.0.ZU;2-7

Abstract

We consider the problem of designing DNA codes, namely sets of equi-length words over the alphabet {A, C, G, T} that satisfy certain combinatorial con straints. This problem is motivated by the task of reliably storing and ret rieving information in synthetic DNA strands for use in DNA computing or as molecular bar codes in chemical libraries. The primary constraints that we consider, defined with respect to a parameter d, are as follows: for every pair of words w, x in a code, there are at least d mismatches between w an d x if w not equal x and also between the reverse of w and the Watson-Crick complement of x. Extending classical results from coding theory, we presen t several upper and lower bounds on the maximum size of such DNA codes and give methods for constructing such codes. An additional constraint that is relevant to the design of DNA codes is that the free energies and enthalpie s of the code words, and thus the melting temperatures, be similar. We desc ribe dynamic programming algorithms that can (a) calculate the total number of words of length n whose free energy value, as approximated by a formula of Breslauer et al. (1986) falls in a given range, and (b) output a random such word. These algorithms are intended for use in heuristic algorithms f or constructing DNA codes.