T. Ebihara et al., Speech synthesis software with a variable speaking rate and its implementation on a 32-bit microprocessor, IEEE CONS E, 46(3), 2000, pp. 887-895
This paper describes a new speech synthesis system that produces speech at
a controllable rate. The method is based on the Oscillator Model, in which
output speech of a desired length can be obtained without extracting
pitch-synchronous positions. This model has been applied to a
residual-excited vocoder to improve the sound quality of synthesized
speech. The proposed method is based on the duration of phonemes in natural
speech. Phonemes are classified for the time-scale modification algorithm,
which makes it possible to easily control the duration of phonemes at
various speaking rates. Sound quality evaluation tests confirmed that the
quality of sound produced by this new method is better than that produced
by existing methods. The method was verified by implementing it as
real-time synthesis software. The software required 8 MIPS of CPU power and
ran on a 32-bit microprocessor.
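
The idea of classifying phonemes so that each class responds differently to a
change in speaking rate can be illustrated with a minimal sketch. The abstract
does not give the paper's classes or scaling rules, so everything below is an
assumption made for illustration: the class names, the ELASTICITY values, and
the functions scaled_duration and scale_utterance are all hypothetical and are
not the authors' algorithm.

```python
# Hypothetical per-class elasticity (NOT taken from the paper):
# 1.0 means the phoneme duration follows the speaking rate fully,
# 0.0 means the duration is left unchanged regardless of rate.
ELASTICITY = {"vowel": 1.0, "fricative": 0.6, "plosive": 0.1}


def scaled_duration(duration_ms, phoneme_class, rate):
    """Return a target duration in ms for one phoneme.

    rate > 1 means faster speech (shorter phonemes),
    rate < 1 means slower speech (longer phonemes).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    fully_scaled = duration_ms / rate
    elasticity = ELASTICITY[phoneme_class]
    # Interpolate between the original and the fully scaled duration.
    return duration_ms + elasticity * (fully_scaled - duration_ms)


def scale_utterance(phonemes, rate):
    """Apply scaled_duration to a list of (label, class, duration_ms) tuples."""
    return [
        (label, cls, scaled_duration(duration_ms, cls, rate))
        for label, cls, duration_ms in phonemes
    ]


if __name__ == "__main__":
    utterance = [("s", "fricative", 100.0), ("a", "vowel", 100.0), ("t", "plosive", 100.0)]
    for label, cls, duration_ms in scale_utterance(utterance, rate=2.0):
        print(label, cls, round(duration_ms, 1))
```

In this sketch, doubling the speaking rate halves the vowel but barely shortens
the plosive. The target durations would then be handed to a time-scale
modification stage, which the sketch does not model.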