Support vector machine classification and validation of cancer tissue samples using microarray expression data

Citation
T.S. Furey et al., Support vector machine classification and validation of cancer tissue samples using microarray expression data, Bioinformatics, 16(10), 2000, pp. 906-914
Number of citations
33
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
ISSN journal
1367-4803
Volume
16
Issue
10
Year of publication
2000
Pages
906 - 914
Database
ISI
SICI code
1367-4803(200010)16:10<906:SVMCAV>2.0.ZU;2-2
Abstract
Motivation: DNA microarray experiments, which generate thousands of gene expression measurements, are being used to gather information from tissue and cell samples regarding gene expression differences that will be useful in diagnosing disease. We have developed a new method to analyse this kind of data using support vector machines (SVMs). This analysis consists of both classification of the tissue samples and an exploration of the data for mis-labeled or questionable tissue results.
Results: We demonstrate the method in detail on samples consisting of ovarian cancer tissues, normal ovarian tissues, and other normal tissues. The dataset consists of expression experiment results for 97 802 cDNAs for each tissue. As a result of computational analysis, a tissue sample is discovered and confirmed to be wrongly labeled. Upon correction of this mistake and the removal of an outlier, perfect classification of tissues is achieved, but not with high confidence. We identify and analyse a subset of genes from the ovarian dataset whose expression is highly differentiated between the types of tissues. To show robustness of the SVM method, two previously published datasets from other types of tissues or cells are analysed. The results are comparable to those previously obtained. We show that other machine learning methods also perform comparably to the SVM on many of those datasets.