We used cDNA microarrays to explore the variation in expression of approximately 8,000 unique genes among the 60 cell lines used in the National Cancer Institute's screen for anti-cancer drugs. Classification of the cell lines based solely on the observed patterns of gene expression revealed a correspondence to the ostensible origins of the tumours from which the cell lines were derived. The consistent relationship between the gene expression patterns and the tissue of origin allowed us to recognize outliers whose previous classification appeared incorrect. Specific features of the gene expression patterns appeared to be related to physiological properties of the cell lines, such as their doubling time in culture, drug metabolism or the interferon response. Comparison of gene expression patterns in the cell lines to those observed in normal breast tissue or in breast tumour specimens revealed features of the expression patterns in the tumours that had recognizable counterparts in specific cell lines, reflecting the tumour, stromal and inflammatory components of the tumour tissue. These results provided a novel molecular characterization of this important group of human cell lines and their relationships to tumours in vivo.