SPEECH ANALYSIS/SYNTHESIS AND MODIFICATION USING AN ANALYSIS-BY-SYNTHESIS/OVERLAP-ADD SINUSOIDAL MODEL

Authors

GEORGE EB; SMITH MJT

Citation

E.B. George and M.J.T. Smith, SPEECH ANALYSIS/SYNTHESIS AND MODIFICATION USING AN ANALYSIS-BY-SYNTHESIS/OVERLAP-ADD SINUSOIDAL MODEL, IEEE Transactions on Speech and Audio Processing, 5(5), 1997, pp. 389-406

Citations number

Subject Categories

Engineering, Electrical & Electronic; Acoustics

Journal title

IEEE Transactions on Speech and Audio Processing

ISSN journal

1063-6676

Volume

5

Issue

5

Year of publication

1997

Pages

389 - 406

Database

ISI

SICI code

1063-6676(1997)5:5<389:SASAMU>2.0.ZU;2-7

Abstract

Sinusoidal modeling has been successfully applied to a broad range of speech processing problems, and offers advantages over linear predictive modeling and the short-time Fourier transform for speech analysis/synthesis and modification. This paper presents a novel speech analysis/synthesis system based on the combination of an overlap-add sinusoidal model with an analysis-by-synthesis technique to determine model parameters. The paper describes this analysis procedure in detail, and introduces an equivalent frequency-domain algorithm that takes advantage of the computational efficiency of the fast Fourier transform (FFT). In addition, a refined overlap-add sinusoidal model capable of shape-invariant speech modification is derived, and a pitch-scale modification algorithm is defined that preserves speech bandwidth and eliminates noise migration effects. Analysis-by-synthesis achieves very high synthetic speech quality by accurately estimating component frequencies, eliminating sidelobe interference effects, and effectively dealing with nonstationary speech events. The refined overlap-add synthesis model correlates well with analysis-by-synthesis, and modifies speech without objectionable artifacts by explicitly controlling shape invariance and phase coherence. The proposed analysis-by-synthesis/overlap-add (ABS/OLA) system allows for both fixed and time-varying time-, frequency-, and pitch-scale modifications, and computational shortcuts using the FFT algorithm make its implementation feasible using currently available hardware.
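
Illustrative note (not part of the original record): the general idea of analysis-by-synthesis for a sinusoidal model can be sketched as a greedy loop that, for one frame, fits a single sinusoid to the current residual, subtracts it, and repeats. The Python sketch below is a simplified assumption of that idea, not the authors' algorithm: it uses an exhaustive time-domain search over a hypothetical frequency grid with an unweighted squared error, and it omits the FFT-based shortcut, the overlap-add synthesis, and the modification stages mentioned in the abstract. The names `fit_sinusoid`, `abs_analyze`, and `grid_size` are invented for illustration.

```python
import math


def fit_sinusoid(residual, omega):
    """Least-squares fit of a*cos(omega*n) + b*sin(omega*n) to the residual.

    Returns (a, b, gain), where gain is the reduction in squared error
    obtained by subtracting the fitted sinusoid.
    """
    cc = cs = ss = rc = rs = 0.0
    for n, r in enumerate(residual):
        c, s = math.cos(omega * n), math.sin(omega * n)
        cc += c * c
        cs += c * s
        ss += s * s
        rc += r * c
        rs += r * s
    det = cc * ss - cs * cs
    if abs(det) < 1e-12:  # degenerate basis, e.g. omega close to 0 or pi
        return 0.0, 0.0, 0.0
    a = (rc * ss - rs * cs) / det
    b = (rs * cc - rc * cs) / det
    return a, b, a * rc + b * rs


def abs_analyze(frame, num_components, grid_size=256):
    """Greedy analysis-by-synthesis over one frame (assumed, simplified).

    Returns a list of (omega, amplitude, phase) tuples, each describing
    amplitude * cos(omega*n + phase), together with the final residual.
    """
    residual = list(frame)
    # Hypothetical candidate frequencies, uniformly spaced in (0, pi).
    candidates = [math.pi * k / grid_size for k in range(1, grid_size)]
    components = []
    for _ in range(num_components):
        best_w, best_a, best_b, best_gain = candidates[0], 0.0, 0.0, -1.0
        for w in candidates:
            a, b, gain = fit_sinusoid(residual, w)
            if gain > best_gain:
                best_w, best_a, best_b, best_gain = w, a, b, gain
        # a*cos(wn) + b*sin(wn) == A*cos(wn + phi), A = hypot(a, b), phi = atan2(-b, a)
        components.append(
            (best_w, math.hypot(best_a, best_b), math.atan2(-best_b, best_a))
        )
        # Synthesize the chosen component and remove it from the residual.
        residual = [
            r - best_a * math.cos(best_w * n) - best_b * math.sin(best_w * n)
            for n, r in enumerate(residual)
        ]
    return components, residual


# Usage: a synthetic two-component frame whose frequencies lie on the grid.
w1, w2 = math.pi * 8 / 64, math.pi * 20 / 64
frame = [
    1.0 * math.cos(w1 * n + 0.3) + 0.5 * math.cos(w2 * n - 1.0)
    for n in range(128)
]
components, residual = abs_analyze(frame, num_components=2, grid_size=64)
```

Because each component is removed from the residual before the next one is estimated, a strong component's contribution no longer biases the estimates of weaker ones, which is one way to read the abstract's remark about eliminating sidelobe interference effects.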