A Bayesian approach for building triphone models for continuous speech recognition

Citation
J. Ming et al., A Bayesian approach for building triphone models for continuous speech recognition, IEEE SPEECH, 7(6), 1999, pp. 678-684
Number of citations
26
Subject categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING
Journal ISSN
1063-6676
Volume
7
Issue
6
Year of publication
1999
Pages
678 - 684
Database
ISI
SICI code
1063-6676(199911)7:6<678:ABAFBT>2.0.ZU;2-V
Abstract
This paper introduces a new statistical framework for constructing triphonic models from models of less context-dependency. This composition reduces the number of models to be estimated by higher than an order of magnitude and is therefore of great significance in relieving the data sparsity problem in triphone-based continuous speech recognition. The new framework is derived from Bayesian statistics, and represents an alternative to other triphone-by-composition techniques, particularly to the model-interpolation and quasitriphone approaches. The potential power of this new framework is explored by an implementation based on the hidden Markov modeling technique. It is shown that the new model structure includes the quasitriphone model as a special case, and leads to more efficient parameter estimation than the model-interpolation method. Phone recognition experiments show an increase in the accuracy over that obtained by comparable models.
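
The abstract's claim that composition reduces the number of models by more than an order of magnitude can be illustrated with a rough count. The sketch below is not taken from the paper: it assumes a hypothetical inventory of 40 phones, and it assumes that the "models of less context-dependency" are left biphones, right biphones and monophones. Both assumptions are for illustration only.

```python
def model_counts(num_phones: int) -> dict:
    """Count models for a full triphone set versus a reduced set.

    Assumption (not from the paper): the reduced set consists of
    left biphones, right biphones and monophones.
    """
    triphones = num_phones ** 3      # every (left, centre, right) combination
    biphones = 2 * num_phones ** 2   # left-context and right-context biphones
    monophones = num_phones          # context-independent phones
    reduced = biphones + monophones
    return {
        "triphones": triphones,
        "reduced": reduced,
        "reduction_factor": triphones / reduced,
    }


# Hypothetical inventory size of 40 phones, chosen only for illustration.
counts = model_counts(40)
print(counts)
```

Under these assumptions the reduction factor grows roughly as half the inventory size, so any inventory of more than about 20 phones gives a reduction of over an order of magnitude.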