PREDICTING UNSEEN TRIPHONES WITH SENONES

Citation
My. Hwang et al., PREDICTING UNSEEN TRIPHONES WITH SENONES, IEEE Transactions on Speech and Audio Processing, 4(6), 1996, pp. 412-419
Number of citations
25
Subject Categories
Engineering, Electrical & Electronic; Acoustics
Journal ISSN
1063-6676
Volume
4
Issue
6
Year of publication
1996
Pages
412 - 419
Database
ISI
SICI code
1063-6676(1996)4:6<412:PUTWS>2.0.ZU;2-O
Abstract
In large-vocabulary speech recognition, we often encounter triphones that are not covered in the training data. These unseen triphones are usually backed off to their corresponding diphones or context-independent phones, which contain less context yet have plenty of training examples. In this paper, we propose to use decision-tree-based senones to generate needed senonic baseforms for these unseen triphones. A decision tree is built for each Markov state of each base phone; the leaves of the trees constitute the senone pool. To find the senone associated with a Markov state of any triphone, the corresponding tree is traversed until a leaf node is reached. The effectiveness of the proposed approach was demonstrated in the ARPA 5000-word speaker-independent Wall Street Journal dictation task. The word error rate was reduced by 11% when unseen triphones were modeled by the decision-tree-based senones instead of context-independent phones. When there were more than five unseen triphones in each test utterance, the error rate reduction was more than 20%.
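
The lookup described in the abstract (one decision tree per Markov state of each base phone, traversed until a leaf senone is reached) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the phone classes NASALS and VOWELS, the questions asked at each node, the three-state topology, the toy trees for the base phone "k", and the senone numbering.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    senone_id: int  # index into the shared senone pool


@dataclass
class Node:
    # Yes/no question about the (left, right) phonetic context.
    question: Callable[[str, str], bool]
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


# Hypothetical phone classes used by the toy questions below.
NASALS = {"m", "n", "ng"}
VOWELS = {"aa", "ae", "iy", "uw"}

# Hypothetical trees: one per (base phone, Markov state).
TREES = {
    ("k", 0): Node(lambda left, right: left in NASALS,
                   Leaf(0),
                   Node(lambda left, right: right in VOWELS, Leaf(1), Leaf(2))),
    ("k", 1): Leaf(3),
    ("k", 2): Node(lambda left, right: right in VOWELS, Leaf(4), Leaf(5)),
}


def find_senone(tree: Union[Node, Leaf], left: str, right: str) -> int:
    """Walk the tree from the root until a leaf is reached."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if node.question(left, right) else node.no
    return node.senone_id


def senonic_baseform(base: str, left: str, right: str, n_states: int = 3) -> list:
    """Return one senone per Markov state for the triphone left-base+right."""
    return [find_senone(TREES[(base, s)], left, right) for s in range(n_states)]


if __name__ == "__main__":
    # Works for any context, including triphones never seen in training.
    print(senonic_baseform("k", "n", "iy"))
    print(senonic_baseform("k", "s", "t"))
```

Because the questions depend only on the context phones and not on whether the triphone occurred in training, the same traversal yields a senonic baseform for an unseen triphone without backing off to a context-independent phone.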