A. Sankar and Ch. Lee, MAXIMUM-LIKELIHOOD APPROACH TO STOCHASTIC MATCHING FOR ROBUST SPEECH RECOGNITION, IEEE transactions on speech and audio processing, 4(3), 1996, pp. 190-202
We present a maximum-likelihood (ML) stochastic matching approach to
decrease the acoustic mismatch between a test utterance and a given set
of speech models so as to reduce the recognition performance
degradation caused by distortions in the test utterance and/or the
model set. We assume that the speech signal is modeled by a set of
subword hidden Markov models (HMMs) Lambda(x). The mismatch between the
observed test utterance Y and the models can be reduced in two ways:
1) by an inverse distortion function F-nu(.) that maps Y into an
utterance X that matches better with the models Lambda(x), and 2) by a
model transformation function G(eta)(.) that maps Lambda(x) to the
transformed model Lambda(y) that matches better with the utterance Y.
We assume the functional form of the transformations F-nu(.) or
G(eta)(.) and estimate the parameters nu or eta in an ML manner using
the expectation-maximization (EM) algorithm. The choice of the form of
F-nu(.) or G(eta)(.) is based on our prior knowledge of the nature of
the acoustic mismatch. The stochastic matching algorithm operates only
on the given test utterance and the given set of speech models, and no
additional training data is required for the estimation of the
mismatch prior to actual testing. Experimental results are presented to
study the properties of the proposed algorithm and to verify the
efficacy of the approach in improving the performance of an HMM-based
continuous speech recognition system in the presence of mismatch due to
different transducers and transmission channels. The proposed
stochastic matching algorithm is found to converge quickly. Further,
the recognition performance in mismatched conditions is greatly
improved, while the performance in matched conditions is well
maintained. The stochastic matching algorithm was able to reduce the
word error rate by about 70% in mismatched conditions.
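
The feature-space variant described above (estimating nu for F-nu(.) by
EM from the test utterance alone) can be illustrated with a minimal
sketch. Everything specific here is an assumption made for
illustration and is not taken from the abstract: the distortion is
taken to be a single additive bias, so that F-nu(y) = y - bias; the
subword HMMs are replaced by a plain Gaussian mixture; each feature is
a single scalar; and the names estimate_bias and log_gauss are
hypothetical.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a scalar Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def estimate_bias(y, weights, means, variances, n_iter=10):
    """EM estimate of an additive bias such that x = y - bias fits the mixture.

    Assumption (not from the source): the mismatch is one additive bias and
    the speech model is a scalar Gaussian mixture standing in for the HMMs.
    Returns the bias and the log-likelihood recorded at each iteration.
    """
    bias = 0.0
    log_liks = []
    for _ in range(n_iter):
        num = den = total = 0.0
        for y_t in y:
            # E-step: component posteriors for the compensated sample y_t - bias.
            log_joint = [math.log(w) + log_gauss(y_t - bias, m, v)
                         for w, m, v in zip(weights, means, variances)]
            peak = max(log_joint)
            log_norm = peak + math.log(sum(math.exp(l - peak) for l in log_joint))
            total += log_norm
            # Accumulate the sufficient statistics for the bias update.
            for l, m, v in zip(log_joint, means, variances):
                gamma = math.exp(l - log_norm)
                num += gamma * (y_t - m) / v
                den += gamma / v
        log_liks.append(total)  # likelihood under the bias used in this E-step
        # M-step: closed-form maximizer of the expected log-likelihood.
        bias = num / den
    return bias, log_liks


# Toy data: draw "clean" samples from the model, then shift them by a
# known bias to mimic a mismatched test utterance.
random.seed(0)
weights, means, variances = [0.5, 0.5], [-3.0, 3.0], [1.0, 1.0]
true_bias = 0.8
clean = [random.gauss(random.choice(means), 1.0) for _ in range(500)]
observed = [x + true_bias for x in clean]

bias, log_liks = estimate_bias(observed, weights, means, variances)
print("estimated bias:", bias)
```

Only the observed samples and the fixed model parameters enter the
estimate, which mirrors the claim that no additional training data is
needed. Under the same additive-bias assumption, the model-space
counterpart G(eta)(.) would instead add the estimated bias to every
model mean and leave the observations untouched.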