A maximum a posteriori approach to speaker adaptation using the trended hidden Markov model

Authors

Chengalvarayan, R; Deng, L

Citation

R. Chengalvarayan and L. Deng, A maximum a posteriori approach to speaker adaptation using the trended hidden Markov model, IEEE SPEECH, 9(5), 2001, pp. 549-557

Citations number

Subject Categories

Electrical & Electronics Engineering

Journal title

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING

ISSN journal

1063-6676

Volume

9

Issue

5

Year of publication

2001

Pages

549 - 557

Database

ISI

SICI code

1063-6676(200107)9:5<549:AMAPAT>2.0.ZU;2-F

Abstract

A formulation of the maximum a posteriori (MAP) approach to speaker adaptation is presented with use of the trended or nonstationary-state hidden Markov model (HMM), where the Gaussian means in each HMM state are characterized by time-varying polynomial trend functions of the state sojourn time. Assuming uncorrelatedness among the polynomial coefficients in the trend functions, we have obtained analytical results for the MAP estimates of the parameters including time-varying means and time-invariant precisions. We have implemented a speech recognizer based on these results in speaker adaptation experiments using the TI46 corpora. The experimental evaluation demonstrates that the trended HMM, with use of either the linear or the quadratic polynomial trend function, consistently outperforms the conventional, stationary-state HMM. The evaluation also shows that the unadapted, speaker-independent models are outperformed by the models adapted by the MAP procedure under supervision with as few as a single adaptation token. Further, adaptation of polynomial coefficients alone is shown to be better than adapting both polynomial coefficients and precision matrices when fewer than four adaptation tokens are used, while the reverse is found with a greater number of adaptation tokens.
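
Illustrative sketch (not taken from the paper): the abstract describes state means that follow a polynomial in the state sojourn time, with MAP estimates derived under uncorrelated polynomial coefficients. The Python below shows one generic way such an estimate could look for a single scalar feature in a single state. It assumes independent Gaussian priors on the coefficients and a fixed, known observation precision, which reduces the problem to textbook Bayesian linear regression. The function names `trend_mean`, `solve`, and `map_trend_coeffs`, the prior form, and the fixed precision are all assumptions made for illustration; the paper's actual priors and its precision update are not given in the abstract.

```python
def trend_mean(coeffs, t):
    """Evaluate the polynomial trend sum_p coeffs[p] * t**p at sojourn time t."""
    return sum(b * t ** p for p, b in enumerate(coeffs))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def map_trend_coeffs(frames, prior_mean, prior_prec, obs_prec):
    """MAP estimate of the trend coefficients for one scalar feature (hypothetical helper).

    frames:     list of (t, y) pairs; t is the sojourn time, y the observed value.
    prior_mean: prior mean of each polynomial coefficient.
    prior_prec: prior precision of each coefficient (independent priors assumed).
    obs_prec:   observation precision, held fixed here (an assumption).
    """
    n = len(prior_mean)
    # Start from the prior: diagonal precision matrix and precision-weighted mean.
    a = [[prior_prec[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [prior_prec[i] * prior_mean[i] for i in range(n)]
    # Accumulate the adaptation data's sufficient statistics.
    for t, y in frames:
        basis = [t ** p for p in range(n)]
        for i in range(n):
            b[i] += obs_prec * basis[i] * y
            for j in range(n):
                a[i][j] += obs_prec * basis[i] * basis[j]
    return solve(a, b)
```

With no adaptation frames the estimate falls back to the prior mean, and as frames accumulate it moves toward the least-squares fit of the trend, which mirrors the usual behavior of MAP adaptation with few versus many tokens.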