ITA
ENG

Adaptive model selection using empirical complexities

Authors

Lugosi, G Nobel, AB

Citation

G. Lugosi et Ab. Nobel, Adaptive model selection using empirical complexities, ANN STATIST, 27(6), 1999, pp. 1830-1864

Citations number

Categorie Soggetti

Mathematics

Journal title

ANNALS OF STATISTICS

ISSN journal

00905364 → ACNP

Volume

Issue

Year of publication

1999

Pages

1830 - 1864

Database

ISI

SICI code

0090-5364(199912)27:6<1830:AMSUEC>2.0.ZU;2-G

Abstract

Given n independent replicates of a jointly distributed pair (X, Y) epsilon R-d x R x , we wish to select from a fixed sequence of model classes F-1, F-2,... a deterministic prediction rule f: R-d --> R whose risk is small. W E investigate the possibility of empirically assessing the complexity of ea ch model class, that is, the actual difficulty of the estimation problem wi thin each class. The estimated complexities are in turn used to define an a daptive model selection procedure, which is based on complexity penalized e mpirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a ca ndidate rule from each cover based on empirical risk. The covering radii ar e determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimi ze the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed emp irically, based on the size of its empirical cover. Finite sample performance bounds are established for the estimates, and the se bounds are applied to several nonparametric estimation problems. The est imates are shown to achieve a favorable trade-off between approximation and estimation error and to perform as well as if the distribution-dependent c omplexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near optimal ra tes of convergence, when each model class has an infinite VC or pseudo dime nsion. For regression estimation with squared loss we modify our estimate to achie ve a faster rate of convergence.