This paper discusses various aspects of smoothing techniques in maximum entropy language modeling, a topic typically not addressed in the literature. The results can be summarized in four statements: 1) Straightforward maximum entropy models with nested features, e.g., tri-, bi-, and uni-grams, result in unsmoothed relative-frequency models.
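As a minimal illustration of statement 1 (the count notation $N(\cdot)$ is assumed here for exposition, not taken from the paper): with nested n-gram features, the maximum entropy solution reduces to the plain relative frequencies,
$$
% notation assumed for illustration: N(h,w), N(h) are training counts
p(w \mid h) = \frac{N(h,w)}{N(h)},
$$
where $h$ is the n-gram history, so no probability mass is reserved for unseen events.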
2) Maximum entropy models with nested features and discounted feature counts approximate backing-off smoothed relative-frequency models with Kneser's advanced marginal back-off distribution. This explains some of the reported success of maximum entropy models in the past.
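For statement 2, a schematic of the backing-off model being approximated (the discount parameter $d$ and this notation are illustrative assumptions, not the paper's exact formulation):
$$
% schematic backing-off model; d, \alpha, \beta are assumed notation
p(w \mid h) =
\begin{cases}
\dfrac{N(h,w) - d}{N(h)} & \text{if } N(h,w) > 0, \\
\alpha(h)\,\beta(w \mid \bar{h}) & \text{otherwise},
\end{cases}
$$
where $\bar{h}$ is the shortened history, $\alpha(h)$ redistributes the discounted probability mass, and $\beta$ is Kneser's marginal back-off distribution, constructed so that the smoothed model preserves the lower-order marginals.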
3) We give perplexity results for nested and nonnested features, e.g., trigrams and distance-trigrams, on a 4-million-word subset of the Wall Street Journal Corpus. From these results we conclude that the smoothing method has a greater effect on perplexity than the method of combining the different types of features.
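Perplexity here is the standard corpus-level measure (with $w_1 \ldots w_N$ the test corpus and $h_i$ the history of $w_i$):
$$
PP = \left[ \prod_{i=1}^{N} p(w_i \mid h_i) \right]^{-1/N}.
$$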
4) We show perplexity results for nonnested features using log-linear interpolation of conventionally smoothed language models, giving evidence that this approach may be a first step toward overcoming the smoothing problem in the context of maximum entropy.
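A sketch of log-linear interpolation in its usual form (the weight and normalization symbols $\lambda_m$ and $Z_\lambda(h)$ are our notation):
$$
% usual log-linear interpolation; \lambda_m, Z_\lambda(h) are assumed notation
p_\lambda(w \mid h) = \frac{1}{Z_\lambda(h)} \prod_{m} p_m(w \mid h)^{\lambda_m},
\qquad
Z_\lambda(h) = \sum_{w'} \prod_{m} p_m(w' \mid h)^{\lambda_m},
$$
where the $p_m$ are the conventionally smoothed component models; unlike linear interpolation, this combination requires the history-dependent normalization $Z_\lambda(h)$.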