PREDICTING UNSEEN TRIPHONES WITH SENONES

Authors

HWANG MY; HUANG XD; ALLEVA FA

Citation

My. Hwang et al., PREDICTING UNSEEN TRIPHONES WITH SENONES, IEEE transactions on speech and audio processing, 4(6), 1996, pp. 412-419

Subject Categories

Engineering, Electrical & Electronic; Acoustics

Journal title

IEEE transactions on speech and audio processing

ISSN journal

10636676

Volume

4

Issue

6

Year of publication

1996

Pages

412 - 419

Database

ISI

SICI code

1063-6676(1996)4:6<412:PUTWS>2.0.ZU;2-O

Abstract

In large-vocabulary speech recognition, we often encounter triphones that are not covered in the training data. These unseen triphones are usually backed off to their corresponding diphones or context-independent phones, which contain less context yet have plenty of training examples. In this paper, we propose to use decision-tree-based senones to generate needed senonic baseforms for these unseen triphones. A decision tree is built for each Markov state of each base phone; the leaves of the trees constitute the senone pool. To find the senone associated with a Markov state of any triphone, the corresponding tree is traversed until a leaf node is reached. The effectiveness of the proposed approach was demonstrated in the ARPA 5000-word speaker-independent Wall Street Journal dictation task. The word error rate was reduced by 11% when unseen triphones were modeled by the decision-tree-based senones instead of context-independent phones. When there were more than five unseen triphones in each test utterance, the error rate reduction was more than 20%.
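
The tree lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the phone classes, the two context questions, the tree shapes, the senone numbering, and the names `find_senone` and `senonic_baseform` are all hypothetical, and a three-state phone model is assumed. The sketch only shows the idea that one tree exists per (base phone, Markov state) and that any triphone, seen or unseen, reaches a leaf senone by answering questions about its left and right context.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Leaf:
    senone_id: int  # index into the shared senone pool (hypothetical numbering)


@dataclass
class Node:
    question: Callable[[str, str], bool]  # asks about (left, right) context
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


# Hypothetical phone classes used by the example questions.
NASALS = {"m", "n", "ng"}
VOWELS = {"aa", "ae", "eh", "iy", "uw"}


def left_is_nasal(left: str, right: str) -> bool:
    return left in NASALS


def right_is_vowel(left: str, right: str) -> bool:
    return right in VOWELS


# One tree per (base phone, Markov state); the shapes here are made up.
TREES: Dict[Tuple[str, int], Union[Node, Leaf]] = {
    ("k", 0): Node(left_is_nasal, Leaf(0), Node(right_is_vowel, Leaf(1), Leaf(2))),
    ("k", 1): Node(right_is_vowel, Leaf(3), Leaf(4)),
    ("k", 2): Node(right_is_vowel, Leaf(5), Node(left_is_nasal, Leaf(6), Leaf(7))),
}


def find_senone(base: str, state: int, left: str, right: str) -> int:
    """Traverse the tree for (base, state) until a leaf is reached."""
    node = TREES[(base, state)]
    while isinstance(node, Node):
        node = node.yes if node.question(left, right) else node.no
    return node.senone_id


def senonic_baseform(base: str, left: str, right: str, n_states: int = 3) -> List[int]:
    """Senone sequence for the triphone left-base+right, one senone per state."""
    return [find_senone(base, s, left, right) for s in range(n_states)]


if __name__ == "__main__":
    # Works the same way whether or not the triphone occurred in training.
    print(senonic_baseform("k", "n", "iy"))
    print(senonic_baseform("k", "s", "t"))
```

Because the questions refer only to the context phones, no training examples of the specific triphone are needed at lookup time; this is what lets the trees supply a baseform where a backoff to context-independent phones would otherwise be used.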