Motivation: Gene annotation is the final goal of gene prediction
algorithms. However, these algorithms frequently make mistakes, and
therefore the use of gene predictions for sequence annotation is hardly
possible. As a result, biologists are forced to conduct time-consuming
gene identification experiments by designing appropriate PCR primers to
test cDNA libraries or by applying RT-PCR, exon trapping/amplification,
or other techniques. This process frequently amounts to 'guessing' PCR
primers on top of unreliable gene predictions and often leads to wasted
experimental effort.

Results: The present paper proposes a simple and reliable
algorithm for experimental gene identification which
bypasses the unreliable gene prediction step. Studies of the
performance of the algorithm on a sample of human genes indicate that
an experimental protocol based on the algorithm's predictions achieves
accurate gene identification with relatively few PCR primers.
Predictions of PCR primers may be used for exon amplification in
preliminary mutation analysis during an attempt to identify a gene
responsible for a
disease. We propose a simple approach to find a short region from a
genomic sequence that with high probability overlaps with some exon of
the gene. The algorithm is enhanced to find one or more segments that
are probably contained in the translated region of the gene and can be
used as PCR primers to select appropriate clones in cDNA libraries by
selective amplification. The algorithm is further extended to locate a
set of PCR primers that uniformly cover all translated regions and can
be used for RT-PCR and further sequencing of (unknown) mRNA.

Availability: The programs are implemented as Web servers (GenePrimer
and CASSANDRA) and can be reached at
http://www-hto.usc.edu/software/procrustes/

Contact: ssze@hto.usc.edu