We examine linear programming (LP) approaches to boosting and demonstrate their efficient solution using LPBoost, a column-generation-based simplex method. We formulate the problem as if all possible weak hypotheses had already been generated. The labels produced by the weak hypotheses become the new feature space of the problem. The boosting task then becomes constructing a learning function in this label space that minimizes misclassification error and maximizes the soft margin. We prove that, for classification, minimizing the 1-norm soft margin error function directly optimizes a generalization error bound. The equivalent linear program can be solved efficiently using column generation techniques developed for large-scale optimization problems.
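To make the formulation concrete (in our notation; the text above does not fix the symbols): with training examples $(x_i, y_i)$, $i = 1, \dots, \ell$, weak hypotheses $h_j$, hypothesis weights $a_j$, margin $\rho$, slacks $\xi_i$, and misclassification penalty $C$, the 1-norm soft margin LP and its dual can be written as

$$
\begin{aligned}
\max_{\rho,\, a,\, \xi} \quad & \rho - C \sum_{i=1}^{\ell} \xi_i \\
\text{s.t.} \quad & y_i \sum_{j} a_j h_j(x_i) \ge \rho - \xi_i, \quad i = 1, \dots, \ell, \\
& \sum_{j} a_j = 1, \quad a_j \ge 0, \quad \xi_i \ge 0,
\end{aligned}
\qquad
\begin{aligned}
\min_{\beta,\, u} \quad & \beta \\
\text{s.t.} \quad & \sum_{i=1}^{\ell} u_i\, y_i\, h_j(x_i) \le \beta \quad \text{for all } j, \\
& \sum_{i=1}^{\ell} u_i = 1, \quad 0 \le u_i \le C.
\end{aligned}
$$

The dual variables $u_i$ are the misclassification costs referred to below: given the current costs $u$, the weak learner is asked for a hypothesis $h$ maximizing the edge $\sum_i u_i y_i h(x_i)$; whenever this edge exceeds $\beta$, the hypothesis prices out and enters the restricted LP as a new column, and otherwise the current solution is optimal.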
The resulting LPBoost algorithm can be used to solve any LP boosting formulation: it iteratively optimizes the dual misclassification costs in a restricted LP and dynamically generates weak hypotheses to form new LP columns. We provide algorithms for soft margin classification, confidence-rated boosting, and regression boosting problems. Unlike gradient boosting algorithms, which may converge only in the limit, LPBoost converges in a finite number of iterations to a global solution satisfying mathematically well-defined optimality conditions. In contrast with gradient-based methods, the optimal solutions of LPBoost are very sparse. Computationally, LPBoost is competitive with AdaBoost in both quality and computational cost.
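As a sketch of how the column generation loop can be realized (our illustration, not the paper's implementation: the decision-stump weak learner, the parameter defaults, and the use of scipy.optimize.linprog as the LP solver are all assumptions), the restricted dual LP and the pricing step might look as follows in Python:

```python
import numpy as np
from scipy.optimize import linprog

def best_stump(X, y, u):
    """Pricing step: search decision stumps h(x) = s * sign(x[f] - t)
    (an assumed weak learner) for one maximizing the edge
    sum_i u_i * y_i * h(x_i)."""
    best_edge, best_params = -np.inf, None
    for f in range(X.shape[1]):
        xs = np.sort(X[:, f])
        for t in (xs[:-1] + xs[1:]) / 2.0:        # candidate thresholds
            h = np.where(X[:, f] > t, 1.0, -1.0)
            edge = np.dot(u * y, h)
            for s in (1.0, -1.0):                 # both polarities
                if s * edge > best_edge:
                    best_edge, best_params = s * edge, (f, t, s)
    return best_edge, best_params

def lpboost(X, y, C=0.2, max_iters=100, tol=1e-6):
    ell = len(y)
    u = np.full(ell, 1.0 / ell)                   # initial misclassification costs
    beta = 0.0
    params, columns, res = [], [], None
    for _ in range(max_iters):
        edge, stump = best_stump(X, y, u)
        if edge <= beta + tol:                    # no hypothesis prices out: optimal
            break
        f, t, s = stump
        params.append(stump)
        columns.append(y * s * np.where(X[:, f] > t, 1.0, -1.0))
        # Restricted dual LP over (u, beta):
        #   min beta  s.t.  sum_i u_i y_i h_j(x_i) <= beta  for all j so far,
        #                   sum_i u_i = 1,  0 <= u_i <= C.
        n = len(columns)
        c = np.r_[np.zeros(ell), 1.0]             # objective: minimize beta
        A_ub = np.c_[np.array(columns), -np.ones(n)]
        A_eq = np.r_[np.ones(ell), 0.0].reshape(1, -1)
        bounds = [(0.0, C)] * ell + [(None, None)]
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n), A_eq=A_eq, b_eq=[1.0],
                      bounds=bounds, method="highs")
        u, beta = res.x[:ell], res.x[ell]
    # The primal hypothesis weights a_j are the (negated) duals of the
    # inequality constraints; with the HiGHS backend these are the marginals.
    a = -res.ineqlin.marginals if res is not None else np.array([])
    return params, a
```

In a typical run most entries of the returned weight vector a come back exactly zero, which illustrates the sparsity property noted above; the final classifier is sign(sum_j a_j h_j(x)) over the retained stumps.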