Model complexity control for regression using VC generalization bounds

Citation
V. Cherkassky et al., "Model complexity control for regression using VC generalization bounds," IEEE Transactions on Neural Networks, vol. 10, no. 5, 1999, pp. 1075-1089
Number of citations
23
Subject categories
AI Robotics and Automatic Control
Journal title
IEEE TRANSACTIONS ON NEURAL NETWORKS
ISSN journal
1045-9227
Volume
10
Issue
5
Year of publication
1999
Pages
1075-1089
Database
ISI
SICI code
1045-9227(199909)10:5<1075:MCCFRU>2.0.ZU;2-4
Abstract
It is well known that for a given sample size there exists a model of optimal complexity corresponding to the smallest prediction (generalization) error. Hence, any method for learning from finite samples needs to have some provisions for complexity control. Existing implementations of complexity control include penalization (or regularization), weight decay (in neural networks), and various greedy procedures (aka constructive, growing, or pruning methods). There are numerous proposals for determining optimal model complexity (aka model selection) based on various (asymptotic) analytic estimates of the prediction risk and on resampling approaches. Nonasymptotic bounds on the prediction risk based on Vapnik-Chervonenkis (VC) theory have been proposed by Vapnik. This paper describes the application of VC bounds to regression problems with the usual squared loss. An empirical study is performed for settings where the VC bounds can be rigorously applied, i.e., linear models and penalized linear models where the VC dimension can be accurately estimated and the empirical risk can be reliably minimized. Empirical comparisons between model selection using VC bounds and classical methods are performed for various noise levels, sample sizes, target functions, and types of approximating functions. Our results demonstrate the advantages of VC-based complexity control with finite samples.
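
For concreteness, the practical form of Vapnik's bound popularized in Cherkassky and Mulier's work inflates the empirical (training) risk by a penalization factor depending on p = h/n, the ratio of VC dimension to sample size: R(h) <= R_emp(h) / (1 - sqrt(p - p*ln(p) + ln(n)/(2n)))_+, where (.)_+ denotes the positive part. The sketch below is a minimal illustration of selecting polynomial degree by minimizing this bound, assuming this form of the penalization factor; the function names (vc_penalty, select_degree), the polynomial regression setup, and the toy data are illustrative assumptions, not code or experiments from the paper.

import numpy as np

def vc_penalty(n, h):
    # Vapnik's penalization factor for regression with squared loss:
    # 1 / (1 - sqrt(p - p*ln(p) + ln(n)/(2n)))_+  with  p = h/n.
    # (Assumed form; an infinite penalty rejects models where the
    # bound becomes vacuous.)
    p = h / n
    inner = p - p * np.log(p) + np.log(n) / (2 * n)
    denom = 1.0 - np.sqrt(inner)
    return np.inf if denom <= 0 else 1.0 / denom

def select_degree(x, y, max_degree=10):
    # Pick the polynomial degree minimizing the VC bound on prediction
    # risk. For a linear model with d free parameters the VC dimension
    # is h = d, so a degree-deg polynomial has h = deg + 1.
    n = len(x)
    best_deg, best_bound = None, np.inf
    for deg in range(max_degree + 1):
        h = deg + 1
        coeffs = np.polyfit(x, y, deg)   # least squares = empirical risk minimization
        resid = y - np.polyval(coeffs, x)
        emp_risk = np.mean(resid ** 2)   # empirical risk (mean squared error)
        bound = emp_risk * vc_penalty(n, h)
        if bound < best_bound:
            best_deg, best_bound = deg, bound
    return best_deg, best_bound

# Toy usage: noisy sine target, small sample (hypothetical data).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 30)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(30)
deg, bound = select_degree(x, y)
print(f"selected degree {deg}, VC risk bound {bound:.4f}")

Because the penalty grows sharply as h/n approaches one, the bound favors low-complexity models at small sample sizes, which is the behavior the abstract's empirical comparisons examine against classical model selection criteria.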