In large-vocabulary speech recognition, we often encounter triphones
that are not covered in the training data. These unseen triphones are
usually backed off to their corresponding diphones or
context-independent phones, which contain less context yet have plenty
of training examples. In this paper, we propose to use
decision-tree-based senones to generate the senonic baseforms needed
for these unseen triphones. A decision tree is built for each Markov
state of each base phone; the leaves of the trees constitute the senone
pool. To find the senone associated with a Markov state of any
triphone, the corresponding tree is traversed until a leaf node is
reached. The effectiveness of the proposed approach was demonstrated on
the ARPA 5000-word speaker-independent Wall Street Journal dictation
task. The word error rate was reduced by 11% when unseen triphones were
modeled by the decision-tree-based senones instead of
context-independent phones. When there were more than five unseen
triphones in each test utterance, the error rate reduction was more
than 20%.
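
The senone lookup described above can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the `Node` layout, the phone classes `NASALS` and `VOWELS`, the toy tree for base phone "k", and the helper names `find_senone` and `senone_for` are all hypothetical. In particular, it assumes that each internal node asks whether the left or right context phone belongs to a given phone class, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass
class Node:
    """One node of a per-state decision tree (hypothetical layout)."""
    senone: Optional[int] = None          # set only on leaf nodes
    side: str = "left"                    # context tested: "left" or "right"
    phones: FrozenSet[str] = frozenset()  # phone class the question asks about
    yes: Optional["Node"] = None          # child taken when the answer is yes
    no: Optional["Node"] = None           # child taken when the answer is no


def find_senone(root: Node, left: str, right: str) -> int:
    """Walk from the root to a leaf and return that leaf's senone id."""
    node = root
    while node.senone is None:
        context = left if node.side == "left" else right
        node = node.yes if context in node.phones else node.no
    return node.senone


# Assumed phone classes, for illustration only.
NASALS = frozenset({"m", "n", "ng"})
VOWELS = frozenset({"aa", "ae", "iy", "uw"})

# Toy tree for Markov state 0 of base phone "k"; its three leaves are senones.
tree_k0 = Node(
    side="left", phones=NASALS,
    yes=Node(senone=0),
    no=Node(side="right", phones=VOWELS,
            yes=Node(senone=1),
            no=Node(senone=2)),
)

# One tree per (base phone, Markov state) pair.
trees: Dict[Tuple[str, int], Node] = {("k", 0): tree_k0}


def senone_for(base: str, state: int, left: str, right: str) -> int:
    """Look up the senone for one state of the triphone left-base+right."""
    return find_senone(trees[(base, state)], left, right)
```

Because the traversal depends only on the answers to the context questions, a triphone that never occurred in training still reaches a leaf; for example, `senone_for("k", 0, "s", "t")` falls through both questions in the toy tree and lands on the last leaf.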