Breast cancer is characterized by considerable histoclinical heterogeneity
that currently hampers the selection of the most appropriate treatment for
each case. This problem could be solved by the identification of new
parameters that better predict the natural history of the disease and its
sensitivity to treatment. A large-scale molecular characterization of
breast cancer could help in this context. Using cDNA arrays, we studied the
quantitative mRNA expression levels of 176 candidate genes in 34 primary
breast carcinomas along three directions: comparison of tumor samples,
correlations of molecular data with conventional histoclinical prognostic
features and gene
correlations. The study revealed extensive heterogeneity of breast tumors
at the transcriptional level. A hierarchical clustering algorithm
identified two molecularly distinct subgroups of tumors characterized by a
different
clinical outcome after chemotherapy. This outcome could not have been
predicted by the commonly used histoclinical parameters. No correlation was
found with the age of patients, tumor size, histological type and grade.
However, gene expression differed according to the presence of lymph node
metastasis and to the estrogen receptor status; ERBB2 expression was
strongly correlated with the lymph node status (P < 0.0001) and that of
GATA3 with the presence of estrogen receptors (P < 0.001). Thus, our
results identified new ways to group tumors according to outcome and new
potential targets of carcinogenesis. They show that the systematic use of
cDNA array testing holds great promise to improve the classification of
breast cancer in terms of prognosis and chemosensitivity and to provide new
potential therapeutic targets.
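
As an illustration of the hierarchical clustering step mentioned above, the following is a minimal sketch, not the procedure used in the study. The abstract does not specify the distance measure or the linkage rule, so the choice of one minus the Pearson correlation as distance and of average linkage is an assumption. The function names (`pearson`, `cluster_two_groups`) and the four-tumor, four-gene expression values are hypothetical and invented for the example; they are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.
    Assumes neither profile is constant (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster_two_groups(profiles):
    """Agglomerative clustering, stopped when two clusters remain.
    Assumed settings: distance = 1 - Pearson r, average linkage."""
    names = list(profiles)
    dist = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 2:
        # Find the closest pair of clusters and merge it.
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]

# Hypothetical toy data: rows are tumors, values are expression levels
# of four genes. These numbers are made up for illustration only.
toy = {
    "T1": [2.0, 1.8, 0.2, 0.1],
    "T2": [1.9, 2.1, 0.3, 0.2],
    "T3": [0.1, 0.3, 2.2, 1.9],
    "T4": [0.2, 0.1, 1.8, 2.0],
}
groups = cluster_two_groups(toy)
print(sorted(groups))
```

On this toy input the two groups separate tumors whose profiles are positively correlated with each other from those with the opposite pattern. With real array data the result would depend on normalization, on the gene set and on the distance and linkage choices.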