Boosting with missing predictors

Citation
Wang, C.Y. and Feng, Ziding, Boosting with missing predictors, Biostatistics (Oxford. Print), 11(2), 2010, pp. 195-212
ISSN journal
1465-4644
Volume
11
Issue
2
Year of publication
2010
Pages
195 - 212
Database
ACNP
SICI code
Abstract
Boosting is an important tool in classification methodology. It combines the performance of many weak classifiers to produce a powerful committee, and its validity can be explained by additive modeling and maximum likelihood. The method has very general applications, especially for high-dimensional predictors. For example, it can be applied to distinguish cancer samples from healthy control samples by using antibody microarray data. Microarray data are often high-dimensional, and many such data sets are incomplete. One natural idea is to impute a missing variable based on the observed predictors. However, computing imputations for high-dimensional predictors with missing data can be rather tedious. In this paper, we propose two conditional mean imputation methods. They can be applied even when a complete-case subset does not exist. Simulation results indicate that the proposed methods are superior to other naive methods. We apply the methods to a pancreatic cancer study in which serum protein microarrays are used for classification.
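
The general idea of imputing a missing variable from the observed predictors can be illustrated with a toy sketch. This is not the method proposed in the paper: it assumes a single always-observed predictor `x1`, a second predictor `x2` that is sometimes missing (`None`), a linear conditional mean, and the existence of complete cases, which the paper's methods do not require. The function names `fit_line` and `impute_conditional_mean` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def impute_conditional_mean(rows):
    """Fill missing x2 values with the fitted conditional mean E[x2 | x1].

    rows is a list of (x1, x2) pairs where x2 may be None.
    Assumption of this sketch: at least two complete cases with
    distinct x1 values exist, so the regression line is identifiable.
    """
    complete = [(x1, x2) for x1, x2 in rows if x2 is not None]
    a, b = fit_line([r[0] for r in complete], [r[1] for r in complete])
    # Observed values are kept; only missing entries are replaced.
    return [(x1, x2 if x2 is not None else a + b * x1) for x1, x2 in rows]


# Toy data in which the complete cases lie on x2 = 2 * x1 + 1.
rows = [(1.0, 3.0), (2.0, 5.0), (3.0, None), (4.0, 9.0)]
imputed = impute_conditional_mean(rows)
```

The completed rows could then be passed to any boosting classifier. With many predictors, the single regression line would have to be replaced by a model for each missing variable given the observed ones, which is where the computational burden mentioned in the abstract arises.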