We consider the problem of designing DNA codes, namely sets of equi-length
words over the alphabet {A, C, G, T} that satisfy certain combinatorial con
straints. This problem is motivated by the task of reliably storing and ret
rieving information in synthetic DNA strands for use in DNA computing or as
molecular bar codes in chemical libraries. The primary constraints that we
consider, defined with respect to a parameter d, are as follows: for every
pair of words w, x in a code, there are at least d mismatches between w an
d x if w not equal x and also between the reverse of w and the Watson-Crick
complement of x. Extending classical results from coding theory, we presen
t several upper and lower bounds on the maximum size of such DNA codes and
give methods for constructing such codes. An additional constraint that is
relevant to the design of DNA codes is that the free energies and enthalpie
s of the code words, and thus the melting temperatures, be similar. We desc
ribe dynamic programming algorithms that can (a) calculate the total number
of words of length n whose free energy value, as approximated by a formula
of Breslauer et al. (1986) falls in a given range, and (b) output a random
such word. These algorithms are intended for use in heuristic algorithms f
or constructing DNA codes.