THE RELATIVE VALUE OF LABELED AND UNLABELED SAMPLES IN PATTERN-RECOGNITION WITH AN UNKNOWN MIXING PARAMETER

Citation
V. Castelli and T. M. Cover, THE RELATIVE VALUE OF LABELED AND UNLABELED SAMPLES IN PATTERN-RECOGNITION WITH AN UNKNOWN MIXING PARAMETER, IEEE Transactions on Information Theory, 42(6), 1996, pp. 2102-2117
Number of citations
28
Subject Categories
Information Science & Library Science; Engineering, Electrical & Electronic
ISSN journal
00189448
Volume
42
Issue
6
Year of publication
1996
Part
2
Pages
2102 - 2117
Database
ISI
SICI code
0018-9448(1996)42:6<2102:TRVOLA>2.0.ZU;2-R
Abstract
We observe a training set Q composed of l labeled samples {(X(1), theta(1)), ..., (X(l), theta(l))} and u unlabeled samples {X'(1), ..., X'(u)}. The labels theta(i) are independent random variables satisfying Pr{theta(i) = 1} = eta, Pr{theta(i) = 2} = 1 - eta. The labeled observations X(i) are independently distributed with conditional density f(theta(i))(.) given theta(i). Let (X(0), theta(0)) be a new sample, independently distributed as the samples in the training set. We observe X(0) and we wish to infer the classification theta(0). In this paper we first assume that the distributions f(1)(.) and f(2)(.) are given and that the mixing parameter eta is unknown. We show that the relative value of labeled and unlabeled samples in reducing the risk of optimal classifiers is the ratio of the Fisher informations they carry about the parameter eta. We then assume that two densities g(1)(.) and g(2)(.) are given, but we do not know whether g(1)(.) = f(1)(.) and g(2)(.) = f(2)(.) or if the opposite holds, nor do we know eta. Thus the learning problem consists of both estimating the optimum partition of the observation space and assigning the classifications to the decision regions. Here, we show that labeled samples are necessary to construct a classification rule and that they are exponentially more valuable than unlabeled samples.
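Illustrative note (not part of the original record): the Fisher-information ratio in the first result can be evaluated numerically. The sketch below is not taken from the paper; it relies only on two standard identities. A labeled sample reveals theta(i), which is Bernoulli with parameter eta, so it carries information 1 / (eta (1 - eta)) about eta. An unlabeled sample is drawn from the mixture eta f(1)(.) + (1 - eta) f(2)(.), so it carries the integral of (f(1) - f(2))^2 / (eta f(1) + (1 - eta) f(2)). The unit-variance Gaussian densities, the function names, and the integration grid are assumptions chosen for illustration.

```python
import math


def normal_pdf(x, mean, std=1.0):
    """Density of a normal distribution (illustrative choice of f(1), f(2))."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fisher_info_labeled(eta):
    """Fisher information about eta in one labeled sample.

    The label is Bernoulli(eta); given the label, X does not depend on eta.
    """
    return 1.0 / (eta * (1.0 - eta))


def fisher_info_unlabeled(eta, f1, f2, lo=-12.0, hi=12.0, steps=24000):
    """Fisher information about eta in one unlabeled sample.

    Midpoint-rule approximation of
        integral (f1 - f2)^2 / (eta*f1 + (1 - eta)*f2) dx
    over [lo, hi]; the grid is an assumption suited to the example below.
    """
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        a, b = f1(x), f2(x)
        mix = eta * a + (1.0 - eta) * b
        if mix > 0.0:  # skip points where both densities underflow to zero
            total += (a - b) ** 2 / mix * dx
    return total


if __name__ == "__main__":
    eta = 0.3
    f1 = lambda x: normal_pdf(x, -1.0)  # hypothetical class-1 density
    f2 = lambda x: normal_pdf(x, 1.0)   # hypothetical class-2 density
    i_l = fisher_info_labeled(eta)
    i_u = fisher_info_unlabeled(eta, f1, f2)
    print("labeled:", i_l, "unlabeled:", i_u, "ratio:", i_l / i_u)
```

In this setting the ratio printed at the end can be read as roughly how many unlabeled samples match one labeled sample for estimating eta. When the two densities barely overlap the ratio approaches 1, and when they coincide an unlabeled sample carries no information about eta.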