ESTIMATION OF GLOTTAL WAVES BASED ON NONMINIMUM-PHASE MODELS

Authors
T. Yahagi; Y. Soeda
Citation
T. Yahagi and Y. Soeda, ESTIMATION OF GLOTTAL WAVES BASED ON NONMINIMUM-PHASE MODELS, Electronics and Communications in Japan. Part 3, Fundamental Electronic Science, 81(11), 1998, pp. 56-66
Number of citations
18
Subject Categories
Engineering, Electrical & Electronic
Journal ISSN
1042-0967
Volume
81
Issue
11
Year of publication
1998
Pages
56 - 66
Database
ISI
SICI code
1042-0967(1998)81:11<56:EOGWBO>2.0.ZU;2-X
Abstract
Since the characteristics of the glottal sound source due to vibration of the vocal cords have a great effect on the quality of synthesized speech, there have been intensive studies on glottal waves. It is known experimentally that the waveform is a rounded asymmetrical triangular wave. Many voiced source models have been proposed in which the glottal waves are parametrically represented as possible approaches toward more natural synthesized speech. There are many unsolved problems, however, since the characteristics of the glottal source must be known for various speech utterances in order to construct the source model. Methods of estimating the glottal wave from the observed speech signal include the inverse filtering method, where a filter with the inverse characteristic to the transfer function of the vocal tract is used. In this method, however, there remains the problem of how the essential error due to the separate estimations of the vocal-tract transfer function and the glottal wave can be eliminated. This paper proposes a new estimation algorithm for glottal waves, where the characteristics of the glottal waves and the vocal tract are estimated simultaneously by considering the vocal-tract transfer function, including the characteristics of the glottal source. In the proposed method, the speech generation process is represented by a nonminimum-phase model including the characteristics of the glottal source, and the glottal wave is estimated by estimating the parameters of the transfer function. In the estimation of the glottal wave, the unknown driving input signal must be estimated in parallel with the estimation of the transfer function parameters. An approximate inverse system is introduced in the proposed method, since the inverse system for the transfer function of the nonminimum-phase model is unstable.
Using the proposed model, the glottal wave can be directly estimated when the vocal-tract characteristic can be represented by an all-pole model. It is also possible to use the nonminimum-phase ARMA model in this method for the analysis/synthesis of speech that includes glottal waves. The glottal wave is estimated by simulation as well as by observation of actual vowels, and satisfactory results are obtained, indicating the usefulness of the proposed estimation algorithm. (C) 1998 Scripta Technica.
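The abstract states that the exact inverse of a nonminimum-phase transfer function is unstable, so an approximate inverse is used. The following is a minimal Python sketch of that general idea only, not the authors' algorithm. The example filter h, the helper names, and the least-squares delayed-inverse construction are all assumptions chosen for illustration.

```python
import numpy as np


def zeros_outside_unit_circle(b):
    """Count zeros of B(z) outside the unit circle.

    Each such zero becomes an unstable pole of the causal inverse 1/B(z).
    """
    return int(np.sum(np.abs(np.roots(b)) > 1.0))


def ls_delayed_inverse(h, length, delay):
    """Least-squares FIR approximation g of 1/H(z), allowing a delay.

    Solves min_g || h * g - delta[n - delay] ||, where * is convolution.
    This is a generic textbook construction (an assumption here), not the
    approximate inverse system defined in the paper.
    """
    h = np.asarray(h, dtype=float)
    n_out = len(h) + length - 1
    # Convolution matrix: column k holds h shifted down by k samples.
    H = np.zeros((n_out, length))
    for k in range(length):
        H[k:k + len(h), k] = h
    target = np.zeros(n_out)
    target[delay] = 1.0
    g, *_ = np.linalg.lstsq(H, target, rcond=None)
    return g, target


# Hypothetical nonminimum-phase FIR: zeros at z = 2 and z = 0.5.
h = np.array([1.0, -2.5, 1.0])
n_bad = zeros_outside_unit_circle(h)

# A stable FIR filter that approximately undoes h, up to a 32-sample delay.
g, target = ls_delayed_inverse(h, length=64, delay=32)
err = float(np.max(np.abs(np.convolve(h, g) - target)))
print("zeros outside unit circle:", n_bad)
print("max deviation from delayed impulse:", err)
```

The delay is what lets a stable filter absorb the anticausal part that the zero outside the unit circle would otherwise demand. With zero delay, the same least-squares fit would be far less accurate for this h.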