A. V. McCree and T. P. Barnwell, A Mixed Excitation LPC Vocoder Model for Low Bit-Rate Speech Coding, IEEE Transactions on Speech and Audio Processing, 3(4), 1995, pp. 242-250
Traditional pitch-excited linear predictive coding (LPC) vocoders use
a fully parametric model to efficiently encode the important
information in human speech. These vocoders can produce intelligible
speech at low data rates (800-2400 b/s), but they often sound
synthetic and generate annoying artifacts such as buzzes, thumps, and
tonal noises. These problems increase dramatically if acoustic
background noise is present at the speech input. This paper presents
a new mixed excitation LPC vocoder model that preserves the low bit
rate of a fully parametric model but adds more free parameters to the
excitation signal so that the synthesizer can mimic more
characteristics of natural human speech.
The new model also eliminates the traditional requirement for a
binary voicing decision so that the vocoder performs well even in the
presence of acoustic background noise. A 2400-b/s LPC vocoder based
on this model has been developed and implemented in simulations and
in a real-time system. Formal subjective testing of this coder
confirms that it
produces natural-sounding speech even in a difficult noise
environment. In fact, diagnostic acceptability measure (DAM) test
scores show that the performance of the 2400-b/s mixed excitation LPC
vocoder is close to that of the government standard 4800-b/s CELP
coder.