In this work, we are concerned with optimal estimation of clean speech from
its noisy version, based on a speech model that we propose. We first
introduce a single speech model which satisfactorily describes voiced
speech, unvoiced speech and silence (i.e., pauses between speech
utterances), and which also allows for exploitation of the long-term
characteristics of noise. We then reformulate the model equations so as to
facilitate subsequent application of the well-established Kalman filter for
computing the optimal estimate of the clean speech in the
minimum-mean-square-error sense. Since the standard algorithm for Kalman
filtering involves multiplications of very large matrices and thus incurs a
high computational cost, we devise a mathematically equivalent algorithm
which is computationally much more efficient, by exploiting the sparsity of
the matrices concerned. Next, we present the methods we use for estimating
the model parameters and give a complete description of the enhancement
process. Performance assessments based on spectrogram plots, objective
measures and informal subjective listening tests all indicate that our
method gives consistently good results. In terms of signal-to-noise ratio,
the improvements over existing methods can be as high as 4 dB.
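
To make the idea of exploiting matrix sparsity in a Kalman filter concrete,
the following is a minimal sketch and not the algorithm of this work. It
assumes a much simpler setting than the proposed speech model: the clean
signal is an autoregressive process of known order and coefficients,
observed in white noise of known variance. The function name
kalman_denoise and the parameters a, q and r are hypothetical names chosen
for illustration. Because the state-transition matrix is in companion form
(a shift plus one dense row), the covariance prediction is carried out by
shifting entries and one matrix-vector product rather than by two full
matrix multiplications.

```python
def kalman_denoise(y, a, q, r):
    """Kalman-filter noisy samples y under an assumed AR model.

    Assumed model (illustrative only):
        s[k] = a[0]*s[k-1] + ... + a[p-1]*s[k-p] + w[k],  Var(w) = q
        y[k] = s[k] + v[k],                               Var(v) = r
    Returns the filtered (MMSE under this model) estimate of each s[k].
    """
    p = len(a)
    c = a[::-1]          # c[i] multiplies state entry x[i]
    x = [0.0] * p        # state x = [s[k-p+1], ..., s[k]]
    P = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    out = []
    for yk in y:
        # Prediction. The transition matrix F shifts the state up by one
        # and fills the last entry with c . x, so F P F^T needs only a
        # shifted copy of P and the single vector u = P c.
        s_pred = sum(ci * xi for ci, xi in zip(c, x))
        u = [sum(P[i][m] * c[m] for m in range(p)) for i in range(p)]
        Pp = [[0.0] * p for _ in range(p)]
        for i in range(p - 1):
            for j in range(p - 1):
                Pp[i][j] = P[i + 1][j + 1]
            Pp[i][p - 1] = Pp[p - 1][i] = u[i + 1]
        Pp[p - 1][p - 1] = sum(ci * ui for ci, ui in zip(c, u)) + q
        xp = x[1:] + [s_pred]

        # Update. Only the last state entry is observed, so the gain is
        # the last column of Pp divided by the innovation variance.
        S = Pp[p - 1][p - 1] + r
        K = [Pp[i][p - 1] / S for i in range(p)]
        innov = yk - xp[p - 1]
        x = [xp[i] + K[i] * innov for i in range(p)]
        last_row = Pp[p - 1][:]
        P = [[Pp[i][j] - K[i] * last_row[j] for j in range(p)]
             for i in range(p)]
        out.append(x[p - 1])
    return out
```

In this sketch each time step costs on the order of p^2 operations instead
of the p^3 of a dense matrix product. In practice the coefficients a and the
variances q and r would have to be estimated from the noisy signal; here
they are simply taken as given.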