The Cancer Genome Anatomy Project database of the National Cancer Institute has thousands of expressed sequences, both known and novel, in the form of expressed sequence tags (ESTs). These ESTs, derived from diverse normal and tumor cDNA libraries, offer an attractive starting point for cancer gene discovery. Using a data-mining tool called Digital Differential Display (DDD) from the Cancer Genome Anatomy Project database, ESTs from six different solid tumor types (breast, colon, lung, ovary, pancreas, and prostate) were analyzed for differential expression. Electronic expression profiles and chromosomal map positions of the resulting hits were generated from the UniGene database. The hits, which were differentially expressed in these tumor-derived libraries, were categorized into major classes of genes, including ribosomal proteins, enzymes, cell surface molecules, secretory proteins, adhesion molecules, and immunoglobulins. Genes known to be up-regulated in prostate, breast, and pancreatic carcinomas were discovered by DDD, demonstrating the utility of this technique. Two hundred known genes and 500 novel sequences were found to be differentially expressed in these select tumor-derived libraries. Test genes were validated for expression specificity by reverse transcription-PCR, providing a proof of concept for gene discovery by DDD. A comprehensive database of hits can be accessed at http://www.fau.edu/cmbb/publications/cancergene.htm. This solid tumor DDD database should facilitate target identification for cancer diagnostics and therapeutics.
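
The kind of comparison behind a digital differential display can be illustrated with a small sketch. It is not the DDD implementation: the abstract does not state which statistic DDD applies, so the use of a two-sided Fisher's exact test on EST counts from two library pools is an assumption made here for illustration. The function names, the gene labels, the counts, and the 0.05 threshold are all hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two library pools; columns are ESTs that match the gene
    and ESTs that do not. (Assumed statistic, for illustration only.)
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x of the matching ESTs fall in pool A.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def differential_hits(counts_a, total_a, counts_b, total_b, alpha=0.05):
    """Return (gene, fraction_in_a, fraction_in_b, p_value) for genes with p < alpha."""
    hits = []
    for gene in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(gene, 0)
        c = counts_b.get(gene, 0)
        p = fisher_exact_two_sided(a, total_a - a, c, total_b - c)
        if p < alpha:
            hits.append((gene, a / total_a, c / total_b, p))
    return hits


# Hypothetical EST counts per gene in a tumor pool and a normal pool,
# each pool containing 1000 ESTs in total.
tumor = {"GENE_X": 30, "GENE_Y": 5}
normal = {"GENE_X": 2, "GENE_Y": 6}
for gene, frac_tumor, frac_normal, p in differential_hits(tumor, 1000, normal, 1000):
    print(gene, frac_tumor, frac_normal, p)
```

With these made-up counts, only the gene whose EST fraction differs sharply between the pools is reported. A real analysis would also need to account for library normalization and for testing many genes at once, neither of which this sketch addresses.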