Noisy speech recognition using de-noised multiresolution analysis acoustic features

Citation
Cp. Chan et al., Noisy speech recognition using de-noised multiresolution analysis acoustic features, J ACOUST SO, 110(5), 2001, pp. 2567-2574
Number of citations
34
Subject Categories
Multidisciplinary, Optics & Acoustics
Journal title
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
ISSN journal
0001-4966
Volume
110
Issue
5
Year of publication
2001
Part
1
Pages
2567 - 2574
Database
ISI
SICI code
0001-4966(200111)110:5<2567:NSRUDM>2.0.ZU;2-G
Abstract
This paper describes a novel application of multiresolution analysis (MRA) in extracting acoustic features that possess de-noising capability for robust speech recognition. The MRA algorithm is used to construct a mel-scaled wavelet packet filter-bank, from which subband powers are computed as the feature parameters for speech recognition. Wiener filtering is applied to a few selected subbands at some intermediate stages of decomposition. For high-frequency bands, Wiener filters are designed based on a reduced fraction of the estimated noise power, making the consonant features much more prominent and contrastive. The proposed method is evaluated in phone recognition experiments with the MIT database. In the presence of stationary white noise at 10-dB SNR, the de-noised MRA features attain a phone recognition rate of 32%. There is a noticeable improvement compared with the accuracy of 29% and 20% attained by the commonly used mel-frequency cepstral coefficients (MFCC) with and without cepstral mean normalization (CMN), respectively. The effectiveness of the MRA features is also verified by the fact that they exhibit smaller distortion from clean speech. (C) 2001 Acoustical Society of America.
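
The abstract's idea of designing Wiener filters from a reduced fraction of the estimated noise power can be illustrated with a small sketch. This is not the authors' filter design: it assumes a textbook power-domain Wiener gain, (P_noisy - a * P_noise) / P_noisy, applied per subband, and the names `wiener_gain`, `denoised_subband_powers`, and `noise_fraction` are hypothetical, chosen for illustration only.

```python
def wiener_gain(noisy_power, noise_power, noise_fraction=1.0):
    """Wiener-style amplitude gain for one subband (illustrative assumption).

    noise_fraction < 1 subtracts only part of the estimated noise power,
    so the band is attenuated less than with the full estimate.
    """
    if noisy_power <= 0.0:
        return 0.0
    # Estimated clean power, clipped at zero so the gain stays in [0, 1].
    clean_estimate = max(noisy_power - noise_fraction * noise_power, 0.0)
    return clean_estimate / noisy_power


def denoised_subband_powers(noisy_powers, noise_powers, fractions):
    """Apply the per-band gain and return the resulting subband powers.

    The gain acts on amplitude, so output power is gain squared times
    the noisy subband power.
    """
    return [
        wiener_gain(p, n, a) ** 2 * p
        for p, n, a in zip(noisy_powers, noise_powers, fractions)
    ]


# Two bands with equal noisy and noise power; the second (standing in for a
# high-frequency band) uses half of the estimated noise power.
powers = denoised_subband_powers([4.0, 4.0], [1.0, 1.0], [1.0, 0.5])
```

With the smaller fraction the second band keeps more of its power, which is one way to read the abstract's remark that consonant features become more prominent. How the paper selects the subbands and the fraction is not stated in this record.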