UNSUPERVISED SPEAKER ADAPTATION USING ALL-PHONEME ERGODIC HIDDEN MARKOV NETWORK

Authors

MIYAZAWA Y, TAKAMI J, SAGAYAMA S, MATSUNAGA S

Citation

Y. Miyazawa et al., UNSUPERVISED SPEAKER ADAPTATION USING ALL-PHONEME ERGODIC HIDDEN MARKOV NETWORK, IEICE transactions on information and systems, E78D(8), 1995, pp. 1044-1050

Citations number

Subject Categories

Computer Science Information Systems

Journal title

IEICE transactions on information and systems

ISSN journal

09168532

Volume

E78D

Issue

8

Year of publication

1995

Pages

1044 - 1050

Database

ISI

SICI code

0916-8532(1995)E78D:8<1044:USAUAE>2.0.ZU;2-M

Abstract

This paper proposes an unsupervised speaker adaptation method using an "all-phoneme ergodic Hidden Markov Network" that combines allophonic (context-dependent phone) acoustic models with stochastic language constraints. A Hidden Markov Network (HMnet) for allophone modeling and allophonic bigram probabilities derived from a large text database are combined to yield a single large ergodic HMM that represents arbitrary speech signals in a particular language, so that the model parameters can be re-estimated from text-unknown speech samples with the Baum-Welch algorithm. When combined with the Vector Field Smoothing (VFS) technique, unsupervised speaker adaptation can be performed effectively. In experiments, this method performed better than our previous unsupervised adaptation method, which used conventional phonetic HMMs and phoneme bigram probabilities, especially when the amount of training data was small.
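
The core idea in the abstract (join per-unit models through bigram probabilities into one ergodic HMM, then re-estimate it on speech with no transcript) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: each unit is reduced to a single state with a discrete emission distribution (the paper uses an HMnet of allophone models), the `self_loop` value, the 3x3 `bigram` table, and the observation sequence are invented, only the emission probabilities are re-estimated, and the VFS step is omitted.

```python
import numpy as np

def build_ergodic_hmm(bigram, self_loop=0.5):
    """Join one-state units into an ergodic HMM (toy assumption):
    stay in a unit with probability self_loop, otherwise move per the bigram."""
    bigram = np.asarray(bigram, dtype=float)
    n = bigram.shape[0]
    return self_loop * np.eye(n) + (1.0 - self_loop) * bigram

def state_posteriors(A, B, pi, obs):
    """Scaled forward-backward pass; returns per-frame state posteriors
    and the log-likelihood of the observation sequence."""
    T, n = len(obs), A.shape[0]
    alpha, beta, scale = np.zeros((T, n)), np.ones((T, n)), np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    return alpha * beta, np.log(scale).sum()

def adapt_emissions(A, B, pi, obs, n_iter=5):
    """Baum-Welch re-estimation of the emission table only, using no labels.
    Transitions stay fixed so the bigram constraints are preserved."""
    B = np.array(B, dtype=float)
    log_likelihoods = []
    for _ in range(n_iter):
        gamma, ll = state_posteriors(A, B, pi, obs)
        log_likelihoods.append(ll)
        new_B = np.zeros_like(B)
        for k in range(B.shape[1]):
            new_B[:, k] = gamma[obs == k].sum(axis=0)  # expected counts of symbol k
        B = new_B / new_B.sum(axis=1, keepdims=True)
    return B, log_likelihoods

# Hypothetical toy data: 3 units, 3 quantized acoustic symbols.
bigram = [[0.1, 0.6, 0.3], [0.5, 0.2, 0.3], [0.4, 0.4, 0.2]]
A = build_ergodic_hmm(bigram)
B0 = np.array([[0.7, 0.2, 0.1], [0.1, 0.7, 0.2], [0.2, 0.1, 0.7]])
pi = np.full(3, 1.0 / 3.0)
obs = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 2, 1, 0])  # "text-unknown" samples
B_adapted, lls = adapt_emissions(A, B0, pi, obs)
print(np.round(B_adapted, 3))
print(np.round(lls, 3))
```

Because the transition matrix encodes the language constraints, no transcript is needed: the forward-backward pass distributes each frame over all units, and the emission update uses those soft assignments. Holding the transitions fixed is a simplification chosen here, not a claim about which parameters the authors re-estimated.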