ON THE ESTIMATION OF SMALL PROBABILITIES BY LEAVING-ONE-OUT

Citation
H. Ney et al., ON THE ESTIMATION OF SMALL PROBABILITIES BY LEAVING-ONE-OUT, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(12), 1995, pp. 1202-1212
Number of citations
16
Subject Categories
Computer Sciences; Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic
Journal ISSN
0162-8828
Volume
17
Issue
12
Year of publication
1995
Pages
1202 - 1212
Database
ISI
SICI code
0162-8828(1995)17:12<1202:OTEOSP>2.0.ZU;2-X
Abstract
In this paper, we apply the leaving-one-out concept to the estimation of 'small' probabilities, i.e., the case where the number of training samples is much smaller than the number of possible classes. After deriving the Turing-Good formula in this framework, we introduce several specific models in order to avoid the problems of the original Turing-Good formula. These models are the constrained model, the absolute discounting model and the linear discounting model. These models are then applied to the problem of bigram-based stochastic language modeling. Experimental results are presented for a German and an English corpus.
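
Illustrative sketch
The following Python sketch is not taken from the paper. It illustrates, under simplifying assumptions, two of the ideas the abstract names: the Turing-Good adjusted count r* = (r + 1) n_{r+1} / n_r, where n_r is the number of events seen exactly r times, and absolute discounting with uniform redistribution of the freed mass over unseen events. The function names, the fallback used when n_{r+1} = 0, and the discount value b are assumptions made for illustration only.

```python
from collections import Counter


def turing_good_counts(counts):
    """Adjusted counts r* = (r + 1) * n_{r+1} / n_r for each observed event."""
    # n[r] = number of distinct events observed exactly r times
    n = Counter(counts.values())
    adjusted = {}
    for event, r in counts.items():
        if n[r + 1] > 0:
            adjusted[event] = (r + 1) * n[r + 1] / n[r]
        else:
            # Assumption: keep the raw count when n_{r+1} = 0. The plain
            # formula would give zero here, which is one of its known problems.
            adjusted[event] = float(r)
    return adjusted


def absolute_discounting(counts, num_classes, b):
    """Subtract a constant b (0 < b < 1) from every nonzero count.

    Returns the probabilities of the seen events and the probability
    assigned to each single unseen event (uniform redistribution, an
    assumption of this sketch).
    """
    total = sum(counts.values())
    seen = len(counts)
    unseen = num_classes - seen
    probs = {event: (r - b) / total for event, r in counts.items()}
    freed_mass = b * seen / total
    p_unseen = freed_mass / unseen if unseen > 0 else 0.0
    return probs, p_unseen


if __name__ == "__main__":
    toy_counts = {"a": 3, "b": 1, "c": 1, "d": 2}  # hypothetical data
    print(turing_good_counts(toy_counts))
    print(absolute_discounting(toy_counts, num_classes=6, b=0.5))
```

In the toy data, the event with the highest count has no n_{r+1} to draw on, which shows why the unmodified formula needs the kind of constrained or discounting models the abstract lists.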