J. Deveth and L. Boves, CHANNEL NORMALIZATION TECHNIQUES FOR AUTOMATIC SPEECH RECOGNITION OVER THE TELEPHONE, Speech communication, 25(1-3), 1998, pp. 149-164
In this paper we aim to identify the underlying causes that can explain
the performance of different channel normalization techniques. To this
end we compared four channel normalization techniques within the
context of connected digit recognition over telephone lines: cepstrum
mean subtraction, the dynamic cepstrum representation, RASTA filtering
and phase-corrected RASTA. We used context-dependent and
context-independent hidden Markov models that were trained using a wide
range of model complexities. The results of our recognition experiments
indicate that each channel normalization technique should preserve the
modulation frequencies in the range between 2 and 16 Hz in the spectrum
of the speech signals. At the same time, DC components in the modulation
spectrum should be effectively removed. With context-independent models
the channel normalization filter should have a flat phase response.
Finally, for our connected digit recognition task it appeared that
cepstrum mean subtraction and phase-corrected RASTA performed equally
well for context-dependent and context-independent models when equal
numbers of model parameters were used. (C) 1998 Elsevier Science B.V.
All rights reserved.
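
The abstract's requirement that DC components in the modulation spectrum be removed can be illustrated with cepstrum mean subtraction, which subtracts the per-utterance time average from each cepstral coefficient. The following is a minimal sketch, not the authors' implementation; the function name, the toy frames and the channel offset are all hypothetical. It assumes that a fixed telephone channel appears as a constant additive offset in the cepstral domain.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-utterance mean from every cepstral coefficient track.

    frames: list of frames, each a list of cepstral coefficients.
    Returns a new list of frames with zero mean over time per coefficient.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Time average of each coefficient (the DC term of its trajectory).
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


# Hypothetical toy utterance: 4 frames of 2 cepstral coefficients.
clean = [[1.0, -2.0], [3.0, 0.0], [2.0, 1.0], [0.0, -1.0]]

# Assumption: a fixed channel adds the same offset to every frame.
channel_offset = [0.5, -0.25]
observed = [[c + h for c, h in zip(frame, channel_offset)] for frame in clean]

normalized_clean = cepstral_mean_subtraction(clean)
normalized_observed = cepstral_mean_subtraction(observed)
```

Under that assumption, `normalized_observed` matches `normalized_clean`, since the constant offset is absorbed into the subtracted mean. The sketch says nothing about how the 2 to 16 Hz modulation band or the phase response is treated by the other three techniques.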