ON THE USE OF A HYBRID HARMONIC STOCHASTIC MODEL FOR TTS SYNTHESIS-BY-CONCATENATION

Citation
T. Dutoit and B. Gosselin, ON THE USE OF A HYBRID HARMONIC STOCHASTIC MODEL FOR TTS SYNTHESIS-BY-CONCATENATION, Speech Communication, 19(2), 1996, pp. 119-143
Number of citations
30
Subject Categories
Communication, Language & Linguistics
Journal title
Speech Communication
Journal ISSN
0167-6393
Volume
19
Issue
2
Year of publication
1996
Pages
119 - 143
Database
ISI
SICI code
0167-6393(1996)19:2<119:OTUOAH>2.0.ZU;2-Z
Abstract
In this paper, we address the possibilities offered by hybrid harmonic/stochastic (H/S) models in the context of wide-band text-to-speech synthesis based on segment concatenation. After a brief reminder of the hypotheses underlying such models and a comprehensive review of a well-known analysis algorithm, namely the one provided by the multi-band excited (MBE) analysis framework, we study how H/S models allow the prosody of segments to be modified and how segment concatenation can be organized so as to minimize mismatches at segment boundaries. In this context, we introduce an original concatenation algorithm which takes advantage of some analysis errors. Speech synthesis algorithms are then described, including an original synthesis technique based on judiciously prepared IFFTs, and the final segmental quality is detailed. More particularly, we examine the differences in the quality obtained when using the model in a narrow-band speech coding context and in a wide-band, concatenation-based synthesis context. We study three possible causes for these differences: the choice of an analysis criterion, the inadequacy of the model due to pitch variations, and the effect of coarticulation on phases.