Tissue classification with gene expression profiles

Authors

Ben-Dor, A; Bruhn, L; Friedman, N; Nachman, I; Schummer, M; Yakhini, Z

Citation

A. Ben-Dor et al., Tissue classification with gene expression profiles, J COMPUT BI, 7(3-4), 2000, pp. 559-583

Citations number

Subject categories

Biochemistry & Biophysics

Journal title

JOURNAL OF COMPUTATIONAL BIOLOGY

ISSN journal

1066-5277

Volume

7

Issue

3-4

Year of publication

2000

Pages

559 - 583

Database

ISI

SICI code

1066-5277(2000)7:3-4<559:TCWGEP>2.0.ZU;2-H

Abstract

Constantly improving gene expression profiling technologies are expected to provide understanding and insight into cancer-related cellular processes. Gene expression data is also expected to significantly aid in the development of efficient cancer diagnosis and classification platforms. In this work we examine three sets of gene expression data measured across sets of tumor and normal clinical samples: The first set consists of 2,000 genes, measured in 62 epithelial colon samples (Alon et al., 1999). The second consists of approximately 100,000 clones, measured in 32 ovarian samples (unpublished extension of the data set described in Schummer et al., 1999). The third set consists of approximately 7,100 genes, measured in 72 bone marrow and peripheral blood samples (Golub et al., 1999). We examine the use of scoring methods, measuring separation of tissue type (e.g., tumors from normals) using individual gene expression levels. These are then coupled with high-dimensional classification methods to assess the classification power of complete expression profiles. We present results of performing leave-one-out cross validation (LOOCV) experiments on the three data sets, employing a nearest neighbor classifier, SVM (Cortes and Vapnik, 1995), AdaBoost (Freund and Schapire, 1997) and a novel clustering-based classification technique. As tumor samples can differ from normal samples in their cell-type composition, we also perform LOOCV experiments using appropriately modified sets of genes, attempting to eliminate the resulting bias. We demonstrate a success rate of at least 90% in tumor versus normal classification, using sets of selected genes, with, as well as without, cellular-contamination-related members. These results are insensitive to the exact selection mechanism, over a certain range.
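
The evaluation scheme described in the abstract (score individual genes, keep the best-separating ones, classify each held-out sample with a nearest neighbor rule) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not specify the scoring functions or the distance measure, so a simple mean-gap score and Euclidean distance stand in for them, the function names are hypothetical, and the data are synthetic rather than any of the three data sets.

```python
import math
import random


def separation_score(values, labels):
    """Score one gene: gap between class means over the summed class spreads.

    Illustrative stand-in; the abstract does not name the scores actually used.
    """
    groups = {0: [], 1: []}
    for value, label in zip(values, labels):
        groups[label].append(value)
    means, spreads = [], []
    for group in (groups[0], groups[1]):
        mean = sum(group) / len(group)
        variance = sum((v - mean) ** 2 for v in group) / len(group)
        means.append(mean)
        spreads.append(math.sqrt(variance))
    return abs(means[0] - means[1]) / (spreads[0] + spreads[1] + 1e-9)


def select_genes(samples, labels, k):
    """Return the indices of the k genes with the highest separation score."""
    n_genes = len(samples[0])
    scores = [
        separation_score([s[g] for s in samples], labels) for g in range(n_genes)
    ]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def nearest_neighbor_label(train, train_labels, query, genes):
    """Label of the training sample closest to query (Euclidean, assumed)."""
    def dist(a, b):
        return math.sqrt(sum((a[g] - b[g]) ** 2 for g in genes))

    best = min(range(len(train)), key=lambda i: dist(train[i], query))
    return train_labels[best]


def loocv_accuracy(samples, labels, k):
    """Leave-one-out cross validation; genes are re-selected inside each fold
    so the held-out sample never influences the selection."""
    correct = 0
    for i in range(len(samples)):
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        genes = select_genes(train, train_labels, k)
        predicted = nearest_neighbor_label(train, train_labels, samples[i], genes)
        if predicted == labels[i]:
            correct += 1
    return correct / len(samples)


def make_toy_data(n_per_class=10, n_genes=30, n_informative=3, shift=5.0, seed=0):
    """Synthetic profiles: Gaussian noise, with a few genes shifted in class 1."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for g in range(n_informative):
                row[g] += shift * label
            samples.append(row)
            labels.append(label)
    return samples, labels


if __name__ == "__main__":
    X, y = make_toy_data()
    print(f"LOOCV accuracy: {loocv_accuracy(X, y, k=3):.2f}")
```

Re-selecting genes inside each fold is a design choice of this sketch: selecting them once on all samples would let the held-out sample leak into the classifier and inflate the estimated success rate.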