STOCHASTIC TRAJECTORY MODELING AND SENTENCE SEARCHING FOR CONTINUOUS SPEECH RECOGNITION

Authors

GONG YF

Citation

Y.F. Gong, STOCHASTIC TRAJECTORY MODELING AND SENTENCE SEARCHING FOR CONTINUOUS SPEECH RECOGNITION, IEEE Transactions on Speech and Audio Processing, 5(1), 1997, pp. 33-44

Citations number

Subject categories

Engineering, Electrical & Electronic; Acoustics

Journal title

IEEE Transactions on Speech and Audio Processing

ISSN journal

1063-6676

Volume

5

Issue

1

Year of publication

1997

Pages

33 - 44

Database

ISI

SICI code

1063-6676(1997)5:1<33:STMASS>2.0.ZU;2-Y

Abstract

The paper first points out a defect in hidden Markov modeling (HMM) of continuous speech, referred to as the trajectory folding phenomenon. A new approach to modeling phoneme-based speech units is then proposed, which represents the acoustic observations of a phoneme as clusters of trajectories in a parameter space. The trajectories are modeled by a mixture of probability density functions of random sequences of states. Each state is associated with a multivariate Gaussian density function, optimized at the state-sequence level. Conditional trajectory duration probability is integrated into the modeling. An efficient sentence search procedure based on trajectory modeling is also formulated. Experiments on a speaker-dependent, 2010-word continuous speech recognition application with a word-pair perplexity of 50, using vocabulary-independent acoustic training and monophone models trained with 80 sentences per speaker, yielded a word error rate of about 1%. The new models were experimentally compared with continuous density mixture HMM (CDHMM) on the same recognition task and gave significantly smaller word error rates. These results suggest that the stochastic trajectory model provides a more in-depth modeling of continuous speech signals.
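
As a rough illustration of the kind of model the abstract describes, the Python sketch below scores one phoneme segment under a mixture of trajectory components, each a fixed-length sequence of Gaussian states with its own duration term. It is not the paper's implementation. The linear resampling of the segment to a fixed number of points, the diagonal covariances, the Gaussian form of the duration probability, and all names (`resample`, `log_gauss_diag`, `log_trajectory_likelihood`, and the keys of the `components` dictionaries) are assumptions made for the example.

```python
import math


def resample(frames, q):
    """Linearly resample a variable-length list of feature vectors to q points.

    Assumption: the paper's abstract does not say how segments are mapped
    onto a fixed number of states; linear interpolation is used here.
    """
    n = len(frames)
    points = []
    for i in range(q):
        pos = i * (n - 1) / (q - 1) if q > 1 else 0.0
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        w = pos - lo
        points.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return points


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (an assumption)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_trajectory_likelihood(frames, components):
    """Log likelihood of a segment under a mixture of trajectory components.

    Each component is a dict with hypothetical keys:
      weight             mixture weight
      means, vars        one mean/variance vector per state
      dur_mean, dur_var  Gaussian duration model conditioned on the component
    """
    q = len(components[0]["means"])
    points = resample(frames, q)
    terms = []
    for c in components:
        ll = math.log(c["weight"])
        # Duration probability conditioned on the trajectory component.
        ll += log_gauss_diag([float(len(frames))], [c["dur_mean"]], [c["dur_var"]])
        # One Gaussian state per resampled point along the trajectory.
        for x, m, v in zip(points, c["means"], c["vars"]):
            ll += log_gauss_diag(x, m, v)
        terms.append(ll)
    # Log-sum-exp over components for numerical stability.
    mx = max(terms)
    return mx + math.log(sum(math.exp(t - mx) for t in terms))
```

In a recognizer, a score of this kind would be computed for each candidate phoneme and segment boundary and combined with language model scores during sentence search; that search procedure is outside the scope of this sketch.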