A MODEL OF DYNAMIC AUDITORY-PERCEPTION AND ITS APPLICATION TO ROBUST WORD RECOGNITION

Authors
B. Strope; A. Alwan
Citation
B. Strope and A. Alwan, A MODEL OF DYNAMIC AUDITORY-PERCEPTION AND ITS APPLICATION TO ROBUST WORD RECOGNITION, IEEE Transactions on Speech and Audio Processing, 5(5), 1997, pp. 451-464
Number of citations
36
Subject Categories
Engineering, Electrical & Electronic; Acoustics
Journal ISSN
1063-6676
Volume
5
Issue
5
Year of publication
1997
Pages
451 - 464
Database
ISI
SICI code
1063-6676(1997)5:5<451:AMODAA>2.0.ZU;2-Z
Abstract
This paper describes two mechanisms that augment the common automatic speech recognition (ASR) front end and provide adaptation and isolation of local spectral peaks. A dynamic model consisting of a linear filterbank with a novel additive logarithmic adaptation stage after each filter output is proposed. An extensive series of perceptual forward masking experiments, together with previously reported forward masking data, determine the model's dynamic parameters. Once parameterized, the simple exponential dynamic mechanism predicts the nature of forward masking data from several studies across wide-ranging frequencies, input levels, and probe delay times. An initial evaluation of the dynamic model together with a local peak isolation mechanism as a front end for dynamic time warp (DTW) and hidden Markov model (HMM) word recognition systems shows an improvement in robustness to background noise when compared to Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and relative spectra (RASTA) based front ends.