Segmental modeling using a continuous mixture of nonparametric models

Citation
J. Goldberger et al., Segmental modeling using a continuous mixture of nonparametric models, IEEE Transactions on Speech and Audio Processing, 7(3), 1999, pp. 262-271
Number of citations
26
Subject Categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING
Journal ISSN
1063-6676
Volume
7
Issue
3
Year of publication
1999
Pages
262 - 271
Database
ISI
SICI code
1063-6676(199905)7:3<262:SMUACM>2.0.ZU;2-A
Abstract
A major limitation of hidden Markov model (HMM) based automatic speech recognition is the inherent assumption that successive observations within a state are independent and identically distributed (IID). The IID assumption is reasonable for some of the states (e.g., a state that corresponds to a steady-state vowel). However, most states clearly violate this assumption (e.g., states corresponding to vowel-consonant transition, diphthongs, etc.) and are in fact characterized by a highly correlated and nonstationary speech signal. In recent years, alternative models have been proposed that attempt to describe the dynamics of the signal within a phonetic unit. The new approach is generally known by the name segmental modeling, since the speech signal is modeled on a segment level base and not on a frame base (such as HMM). We propose a family of new segmental models that are composed of two elements. The first element is a nonparametric representation of the mean and variance trajectories, and the second is some parameterized transformation (e.g., random shift) of the trajectory that is global to the entire segment. The new model is in fact a continuous mixture of segment trajectories. We present recognition results on a large vocabulary task, and compare the model to alternative segment models on a triphone recognition task.