P. Eisert et al., Model-aided coding: A new approach to incorporate facial animation into motion-compensated video coding, IEEE CIR SV, 10(3), 2000, pp. 344-358
Number of citations
39
Subject Categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
We show that traditional waveform coding and 3-D model-based coding are not
competing alternatives, but should be combined to support and complement
each other. Both approaches are combined such that the generality of
waveform coding and the efficiency of 3-D model-based coding are available
where needed. The combination is achieved by providing the block-based video
coder with a second reference frame for prediction, which is synthesized by
the model-based coder. The model-based coder uses a parameterized 3-D head
model, specifying shape and color of a person. We therefore restrict our
investigations to typical videotelephony scenarios that show
head-and-shoulder scenes. Motion and deformation of the 3-D head model
constitute facial expressions which are represented by facial animation
parameters (FAP's) based on the MPEG-4 standard. An intensity gradient based
approach that exploits the 3-D model information is used to estimate the
FAP's, as well as illumination parameters, that describe changes of the
brightness in the scene. Model failures and objects that are not known at
the decoder are handled by standard block-based motion-compensated
prediction, which is not restricted to a special scene content, but results
in lower coding efficiency. A Lagrangian approach is employed to determine
the most efficient prediction for each block from either the synthesized
model frame or the previous decoded frame. Experiments on five video
sequences show that bit-rate savings of about 35% are achieved at equal
average peak signal-to-noise ratio (PSNR) when comparing the model-aided
codec to TMN-10, the state-of-the-art test model of the H.263 standard. This
corresponds to a gain of 2-3 dB in PSNR when encoding at the same average
bit rate.
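
The per-block Lagrangian choice between the two reference frames mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a cost of the common form J = D + lambda * R, with D taken as the sum of squared differences and R as a bit count supplied by the caller. The function names, the candidate labels, and all pixel and rate values are hypothetical, and motion search is omitted entirely.

```python
def ssd(block_a, block_b):
    """Sum of squared differences between two equally sized blocks."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def choose_reference(current, candidates, lam):
    """Pick the candidate prediction with the lowest cost J = D + lam * R.

    candidates maps a label to (predicted_block, rate_bits); both the
    labels and the rate figures are illustrative assumptions.
    """
    best_name, best_cost = None, float("inf")
    for name, (predicted, rate_bits) in candidates.items():
        cost = ssd(current, predicted) + lam * rate_bits
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


# Toy 2x2 block and two hypothetical predictions of it.
current = [[10, 12], [11, 13]]
candidates = {
    "previous_frame": ([[9, 12], [11, 15]], 6),  # distortion 5, 6 bits
    "model_frame": ([[10, 12], [11, 14]], 8),    # distortion 1, 8 bits
}

print(choose_reference(current, candidates, lam=1.0))
print(choose_reference(current, candidates, lam=5.0))
```

In this toy setup, a small lambda favors the prediction with lower distortion (here the model frame), while a larger lambda weights rate more heavily and shifts the choice to the cheaper previous-frame prediction.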