Control of spectral dynamics in concatenative speech synthesis

Authors

Wouters, J; Macon, MW

Citation

J. Wouters and M.W. Macon, Control of spectral dynamics in concatenative speech synthesis, IEEE SPEECH, 9(1), 2001, pp. 30-38

Citations number

Subject Categories

Electrical & Electronics Engineering

Journal title

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING

ISSN journal

1063-6676

Volume

9

Issue

1

Year of publication

2001

Pages

30 - 38

Database

ISI

SICI code

1063-6676(200101)9:1<30:COSDIC>2.0.ZU;2-1

Abstract

Current speech synthesis methods based on the concatenation of waveform units can produce highly intelligible speech capturing the identity of a particular speaker. However, the quality of concatenated speech often suffers from discontinuities between the acoustic units, due to contextual differences and variations in speaking style across the database. In this paper, we present methods to spectrally modify speech units in a concatenative synthesizer to correspond more closely to the acoustic transitions observed in natural speech. First, a technique called "unit fusion" is proposed to reduce spectral mismatch between units. In addition to concatenation units, a second, independent tier of units is selected that defines the desired spectral dynamics at concatenation points. Both unit tiers are "fused" to obtain natural transitions throughout the synthesized utterance. The unit fusion method is further extended to control the perceived degree of articulation of concatenated units. In the second part of the paper, a signal processing technique based on sinusoidal modeling is presented that enables high-quality resynthesis of units with a modified spectral shape.