H. Ney et al., ON THE ESTIMATION OF SMALL PROBABILITIES BY LEAVING-ONE-OUT, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(12), 1995, pp. 1202-1212
In this paper, we apply the leaving-one-out concept to the estimation
of 'small' probabilities, i.e., the case where the number of training
samples is much smaller than the number of possible classes. After
deriving the Turing-Good formula in this framework, we introduce
several specific models in order to avoid the problems of the original
Turing-Good formula. These models are the constrained model, the
absolute discounting model, and the linear discounting model. They are
then applied to the problem of bigram-based stochastic language
modeling. Experimental results are presented for a German and an
English corpus.
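The Turing-Good formula mentioned in the abstract can be stated as follows (a standard formulation; the paper's own notation may differ). With N training samples, let n_r denote the number of classes observed exactly r times. An event seen r times is then assigned the probability

```latex
p_r \;=\; \frac{r^{*}}{N},
\qquad
r^{*} \;=\; (r+1)\,\frac{n_{r+1}}{n_r} .
```

A practical difficulty with this estimate, which motivates the constrained and discounting models introduced in the paper, is that the counts-of-counts n_r are themselves sparse and noisy for larger r, and n_r may even be zero, making r* unreliable.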
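As an illustration of the absolute discounting idea for bigram language modeling, the following sketch subtracts a fixed discount b from every observed bigram count and redistributes the freed mass over a unigram distribution. The function name, the fixed value of b, and the interpolated (rather than backing-off) form are illustrative assumptions, not the paper's exact estimator; the paper derives b by leaving-one-out.

```python
from collections import Counter, defaultdict

def absolute_discount_bigram(tokens, b=0.5):
    """Interpolated absolute discounting for bigrams (illustrative sketch).

    b is a fixed discount here; the paper estimates it by leaving-one-out.
    """
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    # Unigram distribution used to redistribute the discounted mass.
    unigram_counts = Counter(tokens)
    total = len(tokens)
    uni_prob = {w: c / total for w, c in unigram_counts.items()}
    # R(v): number of distinct successors observed after each context v.
    successors = defaultdict(set)
    for v, w in bigram_counts:
        successors[v].add(w)

    def prob(v, w):
        n_v = context_counts.get(v, 0)
        if n_v == 0:
            # Unseen context: fall back entirely to the unigram model.
            return uni_prob.get(w, 0.0)
        c = bigram_counts.get((v, w), 0)
        # Discounted relative frequency of the bigram (v, w).
        discounted = max(c - b, 0.0) / n_v
        # Mass freed by discounting, spread according to the unigram model.
        backoff_mass = b * len(successors[v]) / n_v
        return discounted + backoff_mass * uni_prob.get(w, 0.0)

    return prob
```

For 0 < b <= 1 the resulting conditional distribution sums to one over the vocabulary for any seen context, since the total discounted mass b * R(v) / N(v) is exactly the weight given to the unigram distribution.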