CHANNEL NORMALIZATION TECHNIQUES FOR AUTOMATIC SPEECH RECOGNITION OVER THE TELEPHONE

Authors
J. Deveth, L. Boves
Citation
J. Deveth and L. Boves, CHANNEL NORMALIZATION TECHNIQUES FOR AUTOMATIC SPEECH RECOGNITION OVER THE TELEPHONE, Speech communication, 25(1-3), 1998, pp. 149-164
Number of citations
29
Subject Categories
Communication,"Computer Science Interdisciplinary Applications",Acoustics
Journal title
Speech communication
Journal ISSN
01676393
Volume
25
Issue
1-3
Year of publication
1998
Pages
149 - 164
Database
ISI
SICI code
0167-6393(1998)25:1-3<149:CNTFAS>2.0.ZU;2-A
Abstract
In this paper we aim to identify the underlying causes that can explain the performance of different channel normalization techniques. To this end, we compared four different channel normalization techniques within the context of connected digit recognition over telephone lines: cepstrum mean subtraction, the dynamic cepstrum representation, RASTA filtering and phase-corrected RASTA. We used context-dependent and context-independent hidden Markov models that were trained using a wide range of different model complexities. The results of our recognition experiments indicate that each channel normalization technique should preserve the modulation frequencies in the range between 2 and 16 Hz in the spectrum of the speech signals. At the same time, DC components in the modulation spectrum should be effectively removed. With context-independent models the channel normalization filter should have a flat phase response. Finally, for our connected digit recognition task it appeared that cepstrum mean subtraction and phase-corrected RASTA performed equally well for context-dependent and context-independent models when equal amounts of model parameters were used. (C) 1998 Elsevier Science B.V. All rights reserved.
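As a rough illustration of the first technique named in the abstract, the sketch below shows cepstrum mean subtraction in its common textbook form, not necessarily the exact procedure used by the authors. It relies on the standard observation that a fixed linear channel appears as a constant offset added to every cepstral frame, that is, as a DC component in the modulation spectrum, so subtracting the per-utterance mean removes it. The function name, the frame layout and the synthetic data are all assumptions made for the example.

```python
import random


def cepstral_mean_subtraction(frames):
    """Subtract the per-utterance mean from each cepstral coefficient.

    frames: list of frames, each a list of cepstral coefficients
    (hypothetical layout chosen for this sketch).
    """
    num_frames = len(frames)
    num_coeffs = len(frames[0])
    # Mean of each coefficient over all frames of the utterance.
    means = [sum(f[k] for f in frames) / num_frames for k in range(num_coeffs)]
    return [[f[k] - means[k] for k in range(num_coeffs)] for f in frames]


# Synthetic stand-in for the cepstra of one utterance (assumption: 200 frames, 12 coefficients).
random.seed(0)
speech = [[random.gauss(0.0, 1.0) for _ in range(12)] for _ in range(200)]

# A fixed channel adds the same offset to every frame in the cepstral domain.
channel = [0.1 * k - 0.5 for k in range(12)]
observed = [[c + h for c, h in zip(frame, channel)] for frame in speech]

normalized = cepstral_mean_subtraction(observed)
reference = cepstral_mean_subtraction(speech)
```

In this sketch `normalized` matches `reference` up to rounding, so the constant channel offset no longer affects the features. The time-varying part of each coefficient track, which carries the modulation frequencies the abstract says should be preserved, is left unchanged.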