H. Ney et al., ON THE ESTIMATION OF SMALL PROBABILITIES BY LEAVING-ONE-OUT, IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(12), 1995, pp. 1202-1212
In this paper, we apply the leaving-one-out concept to the estimation
of 'small' probabilities, i.e., the case where the number of training
samples is much smaller than the number of possible classes. After
deriving the Turing-Good formula in this framework, we introduce
several specific models in order to avoid the problems of the original
Turing-Good formula. These models are the constrained model, the
absolute discounting model, and the linear discounting model. They are
then applied to the problem of bigram-based stochastic language
modeling. Experimental results are presented for a German and an
English corpus.
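The Turing-Good formula mentioned in the abstract can be stated as follows (a standard formulation; the paper's own notation may differ). With N training samples, let n_r denote the number of classes observed exactly r times. An event seen r times is then assigned the probability

```latex
p_r \;=\; \frac{r^{*}}{N},
\qquad
r^{*} \;=\; (r+1)\,\frac{n_{r+1}}{n_r} .
```

A practical difficulty with this estimate, which motivates the constrained and discounting models introduced in the paper, is that the counts-of-counts n_r are themselves sparse and noisy for larger r, and n_r may even be zero, making r* unreliable.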
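As an illustration of the absolute discounting idea for bigram language modeling, the following sketch subtracts a fixed discount b from every observed bigram count and redistributes the freed mass over a unigram distribution. The function name, the fixed value of b, and the interpolated (rather than backing-off) form are illustrative assumptions, not the paper's exact estimator; the paper derives b by leaving-one-out.

```python
from collections import Counter, defaultdict

def absolute_discount_bigram(tokens, b=0.5):
    """Interpolated absolute discounting for bigrams (illustrative sketch).

    b is a fixed discount here; the paper estimates it by leaving-one-out.
    """
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    # Unigram distribution used to redistribute the discounted mass.
    unigram_counts = Counter(tokens)
    total = len(tokens)
    uni_prob = {w: c / total for w, c in unigram_counts.items()}
    # R(v): number of distinct successors observed after each context v.
    successors = defaultdict(set)
    for v, w in bigram_counts:
        successors[v].add(w)

    def prob(v, w):
        n_v = context_counts.get(v, 0)
        if n_v == 0:
            # Unseen context: fall back entirely to the unigram model.
            return uni_prob.get(w, 0.0)
        c = bigram_counts.get((v, w), 0)
        # Discounted relative frequency of the bigram (v, w).
        discounted = max(c - b, 0.0) / n_v
        # Mass freed by discounting, spread according to the unigram model.
        backoff_mass = b * len(successors[v]) / n_v
        return discounted + backoff_mass * uni_prob.get(w, 0.0)

    return prob
```

For 0 < b <= 1 the resulting conditional distribution sums to one over the vocabulary for any seen context, since the total discounted mass b * R(v) / N(v) is exactly the weight given to the unigram distribution.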