Motivation: High density DNA oligo microarrays are widely used in
biomedical research. Selection of optimal DNA oligos that are deposited on
the microarrays is critical. Based on sequence information and
hybridization free energy, we developed a new algorithm to select optimal
short (20-25 bases) or long (50 or 70 bases) oligos from genes or open
reading frames (ORFs) and predict their hybridization behavior. Having
optimized probes for each gene
is valuable for two reasons. First, by minimizing background hybridization,
they provide more accurate determinations of true expression levels.
Second, optimum probes minimize the number of probes needed per gene,
thereby decreasing the cost of each microarray, raising the number of genes
on each chip and increasing its usage.
Results: In this paper we describe algorithms to optimize the selection of
specific probes for each gene in an entire genome. The criteria for truly
optimum probes are easily stated, but they are not currently computable at
all levels. We have developed a heuristic approach that is efficiently
computable at all levels and should provide a good approximation to the
true optimum set. We have run the program on the complete genomes of
several model organisms and deposited the results in a database that is
available on-line (http://ural.wustl.edu/-lif/probe.pl).