A CONTINUOUS-TIME DYNAMIC FORMULATION OF VITERBI ALGORITHM FOR ONE-GAUSSIAN-PER-STATE HIDDEN MARKOV-MODELS

Authors
M. Saerens
Citation
M. Saerens, A CONTINUOUS-TIME DYNAMIC FORMULATION OF VITERBI ALGORITHM FOR ONE-GAUSSIAN-PER-STATE HIDDEN MARKOV-MODELS, Speech communication, 12(4), 1993, pp. 321-333
Number of citations
50
Subject categories
Communication, Language & Linguistics
Journal title
Speech communication
ISSN journal
0167-6393
Volume
12
Issue
4
Year of publication
1993
Pages
321 - 333
Database
ISI
SICI code
0167-6393(1993)12:4<321:ACDFOV>2.0.ZU;2-V
Abstract
When using hidden Markov models for speech recognition, it is usually assumed that the probability that a particular acoustic vector is emitted at a given time only depends on the current state and the current acoustic vector observed. In this paper, we introduce another idea, i.e., we assume that, in a given state, the acoustic vectors are generated by a continuous Markov process. Indeed, the time evolution of the acoustic vector is inherently dynamic and continuous, and sampling only occurs for the purpose of computation. This allows us to assign a probability density to the time trajectory of the acoustic vector inside the state, reflecting the probability that this particular path has been generated by the continuous Markov process associated with this state. Roughly speaking, it measures the "adequacy" of the observed trajectory with respect to an ideal trajectory, which is modelled by a vectorial linear differential equation. This model is introduced in order to describe the dynamic behaviour of the acoustic vector inside a state. Once the segmentation is fixed, reestimation formulae for the parameters of the continuous Markov process are derived for the Viterbi algorithm. As usual, the segmentation can be obtained by sampling the continuous process, and by applying dynamic programming to find the best path over all the possible sequences of states and all the possible durations. Finally, we sketch a possible generalization to path mixtures, for which different trajectories are available in each state. However, we have to stress that no experimental results are available at present. Indeed, we did not have the opportunity to test the algorithm on real speech. We are aware of the fact that the assumptions we made may not be appropriate for the modelling of speech.
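The idea summarized in the abstract can be illustrated with a small sketch. It is not the paper's formulation: the abstract gives neither the trajectory density nor the reestimation formulae, so everything below is an assumption made for illustration. The acoustic "vector" is reduced to a scalar, the per-state linear differential equation dx/dt = a*x + b is sampled with a simple Euler step, deviations from the predicted step are scored with a Gaussian density, and a left-to-right state sequence is assumed. The names `segment_log_score` and `best_segmentation` and the parameter tuple `(a, b, sigma)` are hypothetical.

```python
import math


def segment_log_score(x, a, b, sigma, dt=1.0):
    """Log-density of a sampled scalar trajectory inside one state.

    Assumption (not from the paper): Euler sampling of dx/dt = a*x + b,
    so x[t+1] ~ Normal(x[t] + dt*(a*x[t] + b), sigma**2 * dt).
    """
    var = sigma * sigma * dt
    total = 0.0
    for prev, cur in zip(x, x[1:]):
        pred = prev + dt * (a * prev + b)  # ideal next sample for this state
        total += -0.5 * math.log(2 * math.pi * var) - (cur - pred) ** 2 / (2 * var)
    return total


def best_segmentation(x, states, dt=1.0, min_len=2):
    """Dynamic programming over segment boundaries (all durations >= min_len).

    states: list of hypothetical (a, b, sigma) tuples, visited left to right.
    Returns (best log-score, boundaries), where state k covers
    x[boundaries[k]:boundaries[k + 1]]. Steps across a boundary are not scored.
    """
    n, n_states = len(x), len(states)
    neg_inf = float("-inf")
    # best[k][e]: best log-score of explaining x[:e] with the first k states
    best = [[neg_inf] * (n + 1) for _ in range(n_states + 1)]
    back = [[0] * (n + 1) for _ in range(n_states + 1)]
    best[0][0] = 0.0
    for k in range(1, n_states + 1):
        a, b, sigma = states[k - 1]
        for e in range(k * min_len, n + 1):
            for s in range((k - 1) * min_len, e - min_len + 1):
                if best[k - 1][s] == neg_inf:
                    continue
                cand = best[k - 1][s] + segment_log_score(x[s:e], a, b, sigma, dt)
                if cand > best[k][e]:
                    best[k][e] = cand
                    back[k][e] = s
    if best[n_states][n] == neg_inf:
        raise ValueError("trajectory too short for the requested states")
    # Trace the boundaries back from the end of the trajectory.
    bounds = [n]
    e = n
    for k in range(n_states, 0, -1):
        e = back[k][e]
        bounds.append(e)
    bounds.reverse()
    return best[n_states][n], bounds


if __name__ == "__main__":
    # Toy data: decay under dx/dt = -0.5*x, then growth under dx/dt = 1.
    samples = [4.0, 2.0, 1.0, 0.5, 3.0, 4.0, 5.0]
    toy_states = [(-0.5, 0.0, 1.0), (0.0, 1.0, 1.0)]
    print(best_segmentation(samples, toy_states))
```

On the toy data, the first four samples follow the first state's dynamics exactly and the last three follow the second state's, so the search places the single boundary between them. A fuller treatment would use vector-valued states, a proper continuous-time density, and the reestimation step mentioned in the abstract, none of which can be reconstructed from this record.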