Linear programming boosting via column generation

Citation
A. Demiriz et al., Linear programming boosting via column generation, MACH LEARN, 46(1-3), 2002, pp. 225-254
Citations number
26
Subject Categories
AI Robotics and Automatic Control
Journal title
MACHINE LEARNING
ISSN journal
0885-6125
Volume
46
Issue
1-3
Year of publication
2002
Pages
225 - 254
Database
ISI
SICI code
0885-6125(2002)46:1-3<225:LPBVCG>2.0.ZU;2-L
Abstract
We examine linear program (LP) approaches to boosting and demonstrate their efficient solution using LPBoost, a column generation based simplex method. We formulate the problem as if all possible weak hypotheses had already been generated. The labels produced by the weak hypotheses become the new feature space of the problem. The boosting task becomes to construct a learning function in the label space that minimizes misclassification error and maximizes the soft margin. We prove that for classification, minimizing the 1-norm soft margin error function directly optimizes a generalization error bound. The equivalent linear program can be efficiently solved using column generation techniques developed for large-scale optimization problems. The resulting LPBoost algorithm can be used to solve any LP boosting formulation by iteratively optimizing the dual misclassification costs in a restricted LP and dynamically generating weak hypotheses to make new LP columns. We provide algorithms for soft margin classification, confidence-rated, and regression boosting problems. Unlike gradient boosting algorithms, which may converge in the limit only, LPBoost converges in a finite number of iterations to a global solution satisfying mathematically well-defined optimality conditions. The optimal solutions of LPBoost are very sparse in contrast with gradient based methods. Computationally, LPBoost is competitive with AdaBoost in quality and computational cost.
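One standard way to write the soft-margin LP pair the abstract refers to (the notation here is ours, not quoted from the paper) is as a primal over hypothesis weights a_j, slacks xi_i, and margin rho, together with its dual over the misclassification costs u_i that LPBoost iterates on:

```latex
% Primal (restricted master over generated hypotheses h_1..h_J):
%   minimize over a, xi, rho
\min_{a,\,\xi,\,\rho} \; -\rho + D \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad
y_i \sum_{j=1}^{J} a_j h_j(x_i) + \xi_i \ge \rho \;\; \forall i,
\qquad \sum_{j=1}^{J} a_j = 1, \qquad a \ge 0,\; \xi \ge 0.

% Dual (the LP LPBoost actually solves each iteration):
\min_{u,\,\beta} \; \beta
\quad \text{s.t.} \quad
\sum_{i=1}^{n} u_i\, y_i\, h_j(x_i) \le \beta \;\; \forall j,
\qquad \sum_{i=1}^{n} u_i = 1, \qquad 0 \le u_i \le D.
```

Column generation adds a new hypothesis h whenever the weak learner finds an edge sum_i u_i y_i h(x_i) exceeding beta; when no such hypothesis exists, the current restricted solution is optimal for the full LP.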
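The column-generation loop the abstract describes can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' implementation: decision stumps serve as the weak learner, the restricted dual LP is solved with scipy's HiGHS-based `linprog`, and the cost bound `D` is a free parameter.

```python
# Hedged sketch of LPBoost-style column generation (illustrative, not the paper's code).
import numpy as np
from scipy.optimize import linprog

def stump_predictions(X, thresh, sign):
    # Decision stump h(x) = sign * (+1 if x > thresh else -1) on 1-D inputs.
    return sign * np.where(X > thresh, 1.0, -1.0)

def best_stump(X, y, u):
    # Weak learner: maximize the weighted edge  sum_i u_i y_i h(x_i).
    Xs = np.sort(X)
    best = None
    for thresh in (Xs[:-1] + Xs[1:]) / 2.0:
        for sign in (+1.0, -1.0):
            edge = np.sum(u * y * stump_predictions(X, thresh, sign))
            if best is None or edge > best[0]:
                best = (edge, thresh, sign)
    return best

def lpboost(X, y, D=0.5, tol=1e-6, max_iters=20):
    n = len(y)
    u = np.full(n, 1.0 / n)   # dual misclassification costs
    beta = 0.0
    H, stumps = [], []        # generated LP columns and their stumps
    for _ in range(max_iters):
        edge, thresh, sign = best_stump(X, y, u)
        if edge <= beta + tol:          # no column improves the restricted LP
            break
        stumps.append((thresh, sign))
        H.append(y * stump_predictions(X, thresh, sign))
        # Restricted dual LP over x = (u_1..u_n, beta):
        #   min beta  s.t.  sum_i u_i y_i h_j(x_i) - beta <= 0  for each j,
        #                   sum_i u_i = 1,  0 <= u_i <= D.
        c = np.r_[np.zeros(n), 1.0]
        A_ub = np.hstack([np.array(H), -np.ones((len(H), 1))])
        A_eq = np.r_[np.ones(n), 0.0].reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(H)),
                      A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0.0, D)] * n + [(None, None)], method="highs")
        u, beta = res.x[:n], res.x[n]
    # Primal hypothesis weights a_j are the duals of the <= constraints.
    a = np.maximum(-res.ineqlin.marginals, 0.0)
    return stumps, a / a.sum()

def predict(stumps, a, X):
    votes = sum(w * stump_predictions(X, t, s) for (t, s), w in zip(stumps, a))
    return np.sign(votes)
```

For example, on a separable toy set such as `X = [0,1,2,3,5,6,7,8]` with the first four points labeled -1 and the rest +1, a single generated stump already makes the weak learner's best edge equal to beta, so the loop terminates after one LP solve; this finite termination at a verifiable optimality condition is the behavior the abstract contrasts with gradient boosting.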