An efficient and robust statistical modeling approach to discover differentially expressed genes using genomic expression profiles

Citation
J.G. Thomas et al., An efficient and robust statistical modeling approach to discover differentially expressed genes using genomic expression profiles, GENOME RES, 11(7), 2001, pp. 1227-1236
Number of citations
83
Subject Categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
Journal ISSN
1088-9051
Volume
11
Issue
7
Year of publication
2001
Pages
1227 - 1236
Database
ISI
SICI code
1088-9051(200107)11:7<1227:AEARSM>2.0.ZU;2-D
Abstract
We have developed a statistical regression modeling approach to discover genes that are differentially expressed between two predefined sample groups in DNA microarray experiments. Our model is based on well-defined assumptions, uses rigorous and well-characterized statistical measures, and accounts for the heterogeneity and genomic complexity of the data. In contrast to cluster analysis, which attempts to define groups of genes and/or samples that share common overall expression profiles, our modeling approach uses known sample group membership to focus on expression profiles of individual genes in a sensitive and robust manner. Further, this approach can be used to test statistical hypotheses about gene expression. To demonstrate this methodology, we compared the expression profiles of 11 acute myeloid leukemia (AML) and 27 acute lymphoblastic leukemia (ALL) samples from a previous study (Golub et al. 1999) and found 141 genes differentially expressed between AML and ALL with a 1% significance at the genomic level. Using this modeling approach to compare different sample groups within the AML samples, we identified a group of genes whose expression profiles correlated with that of thrombopoietin and found that genes whose expression associated with AML treatment outcome lie in recurrent chromosomal locations. Our results are compared with those obtained using t-tests or Wilcoxon rank sum statistics.
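
The abstract does not specify the regression model itself, so the following is only a minimal sketch of one of the baseline comparisons it names: a per-gene Wilcoxon rank sum test between two predefined sample groups. The function names (`rank_sum_p`, `select_genes`), the input layout, the normal approximation without a tie correction to the variance, and the reading of "1% significance at the genomic level" as a Bonferroni threshold are all assumptions made for illustration, not details taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_p(group_a, group_b):
    """Two-sided Wilcoxon rank sum p-value (normal approximation).

    Hypothetical helper: ties receive midranks, but the variance is
    not tie-corrected, which is adequate for continuous expression values.
    """
    values = list(group_a) + list(group_b)
    order = sorted(range(len(values)), key=lambda i: values[i])

    # Assign ranks 1..N, averaging the ranks of tied values.
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        midrank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = midrank
        pos = end + 1

    n_a, n_b = len(group_a), len(group_b)
    rank_sum = sum(ranks[:n_a])                      # rank sum of the first group
    mean = n_a * (n_a + n_b + 1) / 2                 # expected value under H0
    sd = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)      # standard deviation under H0
    z = (rank_sum - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


def select_genes(profiles, alpha=0.01):
    """Return genes whose p-value passes a Bonferroni-adjusted threshold.

    profiles: dict mapping gene name -> (values_in_group_a, values_in_group_b).
    Bonferroni is an assumed stand-in for a genome-wide significance level.
    """
    threshold = alpha / len(profiles)
    return [gene for gene, (a, b) in profiles.items()
            if rank_sum_p(a, b) < threshold]
```

Calling `select_genes` with one entry per gene, each holding the expression values of the two sample groups (for example 11 AML and 27 ALL samples), returns the genes that remain significant after the adjustment. A rank-based test of this kind is robust to outliers but uses less of the data than a model-based approach.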