The rapid discovery of sequence information from the Human Genome Project
has exponentially increased the amount of data that can be retrieved from
biomedical experiments. Gene expression profiling, through the use of
microarray technology, is rapidly contributing to an improved understanding
of global, coordinated cellular events in a variety of paradigms. In the
field of toxicology, the potential application of toxicogenomics to indicate
the toxicity of unknown compounds has been suggested but remains largely
unsubstantiated to date. A major supposition of toxicogenomics is that
global changes in the expression of individual mRNAs (i.e., the
transcriptional responses of cells to toxicants) will be sufficiently
distinct, robust, and reproducible to allow discrimination of toxicants from
different classes. Definitive demonstration is still lacking for such
specific "genetic fingerprints," as opposed to nonspecific general stress
responses that may be indistinguishable between compounds and therefore not
suitable as probes of toxic mechanisms. The present studies demonstrate a
general application of toxicogenomics that distinguishes two mechanistically
unrelated classes of toxicants (cytotoxic anti-inflammatory drugs and
DNA-damaging agents) based solely upon a cluster-type analysis of genes
differentially induced or repressed in cultured cells during exposure to
these compounds. Initial comparisons of the expression patterns for 100
toxic compounds, using all approximately 250 genes on a DNA microarray
(approximately 2.5 million data points), failed to discriminate between
toxicant classes. A major obstacle encountered in these studies was the lack
of reproducible gene responses, presumably due to biological variability and
technological limitations. Thus multiple replicate observations for the
prototypical DNA-damaging agent, cisplatin, and the non-steroidal
anti-inflammatory drugs (NSAIDs) diflunisal and flufenamic acid were made,
and a subset of genes yielding reproducible inductions/repressions was
selected for comparison. Many of the "fingerprint genes" identified in these
studies were consistent with previous observations reported in the
literature (e.g., the well-characterized induction by cisplatin of
p53-regulated transcripts such as p21(waf1/cip1) and PCNA [proliferating
cell nuclear antigen]). These gene subsets not only discriminated among the
three compounds in the learning set but also showed predictive value for the
rest of the database (approximately 100 compounds of various toxic
mechanisms). Further refinement of the clustering strategy, using a
computer-based optimization algorithm, yielded even better results and
demonstrated that genes that ultimately best discriminated between DNA
damage and NSAIDs were involved in such diverse processes as DNA repair,
xenobiotic metabolism, transcriptional activation, structural maintenance,
cell cycle control, signal transduction, and apoptosis. The determination of
genes whose responses appropriately group and dissociate anti-inflammatory
versus DNA-damaging agents provides an initial paradigm upon which to build
for future, higher throughput-based identification of toxic compounds using
gene expression patterns alone.
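
The general idea of selecting reproducibly responding genes from replicate
observations and then grouping a compound by similarity over that subset can
be illustrated with a minimal sketch. This is not the authors' procedure: the
abstract does not specify the selection rule, the similarity measure, or the
optimization algorithm, so the fold-change threshold, the use of Pearson
correlation, all function names, and the toy log2 ratios below are
assumptions made purely for illustration.

```python
from statistics import mean

def reproducible_genes(replicates, min_abs_log2=1.0):
    """Indices of genes induced (or repressed) past the threshold in EVERY replicate.

    `replicates` is a list of profiles; each profile is a list of log2 ratios.
    The threshold of 1.0 (two-fold change) is an illustrative assumption.
    """
    selected = []
    for g in range(len(replicates[0])):
        values = [rep[g] for rep in replicates]
        always_up = all(v >= min_abs_log2 for v in values)
        always_down = all(v <= -min_abs_log2 for v in values)
        if always_up or always_down:
            selected.append(g)
    return selected

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def class_profile(replicates):
    """Gene-wise mean of the replicate profiles for one class."""
    return [mean(col) for col in zip(*replicates)]

def nearest_class(profile, references, genes):
    """Assign a profile to the reference class it correlates with best,
    using only the selected gene subset."""
    sub = [profile[g] for g in genes]
    scores = {name: pearson(sub, [ref[g] for g in genes])
              for name, ref in references.items()}
    return max(scores, key=scores.get), scores

# Toy log2 ratios for six hypothetical genes (invented values, not real data).
dna_damage_reps = [
    [2.1, 1.5, -1.2, 0.3, -0.1, 0.2],
    [1.8, 1.2, -1.6, -0.4, 0.2, 1.1],
    [2.4, 1.1, -1.1, 0.1, 0.0, -0.9],
]
nsaid_reps = [
    [0.1, -0.2, 0.3, 1.9, -1.4, 0.0],
    [-0.3, 0.4, 0.1, 1.6, -1.2, 0.5],
    [0.2, 0.1, -0.2, 2.2, -1.8, -0.6],
]

# Union of the genes that respond reproducibly in either class.
fingerprint = sorted(set(reproducible_genes(dna_damage_reps))
                     | set(reproducible_genes(nsaid_reps)))

references = {
    "DNA-damaging": class_profile(dna_damage_reps),
    "NSAID": class_profile(nsaid_reps),
}

unknown = [1.9, 1.0, -1.3, 0.2, 0.1, 0.4]
label, scores = nearest_class(unknown, references, fingerprint)
print(fingerprint, label)
```

Restricting the comparison to the reproducible subset drops the sixth gene,
whose sign flips between replicates, so it cannot add noise to the
correlation. A real analysis would need a statistically justified selection
criterion and validation on compounds outside the learning set.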