THEORY AND PRACTICE OF VECTOR QUANTIZERS TRAINED ON SMALL TRAINING SETS

Citation
D. Cohn et al., THEORY AND PRACTICE OF VECTOR QUANTIZERS TRAINED ON SMALL TRAINING SETS, IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(1), 1994, pp. 54-65
Citations number
20
Subject Categories
Computer Sciences; Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
ISSN journal
0162-8828
Volume
16
Issue
1
Year of publication
1994
Pages
54 - 65
Database
ISI
SICI code
0162-8828(1994)16:1<54:TAPOVQ>2.0.ZU;2-5
Abstract
We examine how the performance of a memoryless vector quantizer changes as a function of its training set size. Specifically, we study how well the training set distortion predicts test distortion when the training set is a randomly drawn subset of blocks from the test or training image(s). Using the Vapnik-Chervonenkis (VC) dimension, we derive formal bounds for the difference of test and training distortion of vector quantizer codebooks. We then describe extensive empirical simulations that test these bounds for a variety of codebook sizes and vector dimensions, and give practical suggestions for determining the training set size necessary to achieve good generalization from a codebook. We conclude that, by using training sets comprised of only a small fraction of the available data, one can produce results that are close to the results obtainable when all available data are used.
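
The experimental setup described in the abstract, training a codebook on a randomly drawn subset of image blocks and comparing training distortion with distortion over all blocks, can be sketched roughly as follows. This is a minimal Python illustration, not the authors' code; the block size, codebook size, iteration count, training fraction, and the synthetic stand-in image are illustrative assumptions, and the codebook is trained with a plain generalized Lloyd (k-means) loop.

```python
import numpy as np

def extract_blocks(image, block=4):
    """Split a grayscale image into non-overlapping block x block vectors."""
    h, w = image.shape
    h -= h % block
    w -= w % block
    blocks = (image[:h, :w]
              .reshape(h // block, block, w // block, block)
              .swapaxes(1, 2)
              .reshape(-1, block * block))
    return blocks.astype(np.float64)

def train_codebook(vectors, codebook_size=64, iters=25, seed=0):
    """Generalized Lloyd / k-means: alternate nearest-codeword assignment
    and centroid update for a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    codebook = vectors[rng.choice(len(vectors), codebook_size, replace=False)]
    for _ in range(iters):
        # Assign each training vector to its nearest codeword (squared error).
        d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        # Move each codeword to the centroid of the vectors assigned to it.
        for k in range(codebook_size):
            members = vectors[labels == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook

def distortion(vectors, codebook):
    """Mean squared error per vector under nearest-codeword quantization."""
    d2 = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).mean()

# Train on a small random fraction of the blocks, evaluate on all blocks.
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(256, 256)).astype(np.float64)  # stand-in image
blocks = extract_blocks(image, block=4)
train_idx = rng.choice(len(blocks), size=len(blocks) // 10, replace=False)
codebook = train_codebook(blocks[train_idx], codebook_size=64)
print("training distortion:", distortion(blocks[train_idx], codebook))
print("test distortion:    ", distortion(blocks, codebook))
```

Comparing the two printed distortions as the training fraction varies gives the kind of empirical generalization curve the paper studies; the paper's VC-dimension analysis bounds how far these two quantities can differ.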