Model-aided coding: A new approach to incorporate facial animation into motion-compensated video coding

Citation
P. Eisert et al., Model-aided coding: A new approach to incorporate facial animation into motion-compensated video coding, IEEE Transactions on Circuits and Systems for Video Technology, 10(3), 2000, pp. 344-358
Citations number
39
Subject categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
ISSN journal
1051-8215
Volume
10
Issue
3
Year of publication
2000
Pages
344 - 358
Database
ISI
SICI code
1051-8215(200004)10:3<344:MCANAT>2.0.ZU;2-8
Abstract
We show that traditional waveform coding and 3-D model-based coding are not competing alternatives, but should be combined to support and complement each other. Both approaches are combined such that the generality of waveform coding and the efficiency of 3-D model-based coding are available where needed. The combination is achieved by providing the block-based video coder with a second reference frame for prediction, which is synthesized by the model-based coder. The model-based coder uses a parameterized 3-D head model, specifying shape and color of a person. We therefore restrict our investigations to typical videotelephony scenarios that show head-and-shoulder scenes. Motion and deformation of the 3-D head model constitute facial expressions, which are represented by facial animation parameters (FAP's) based on the MPEG-4 standard. An intensity-gradient-based approach that exploits the 3-D model information is used to estimate the FAP's, as well as illumination parameters that describe changes of the brightness in the scene. Model failures and objects that are not known at the decoder are handled by standard block-based motion-compensated prediction, which is not restricted to a special scene content, but results in lower coding efficiency. A Lagrangian approach is employed to determine the most efficient prediction for each block from either the synthesized model frame or the previous decoded frame. Experiments on five video sequences show that bit-rate savings of about 35% are achieved at equal average peak signal-to-noise ratio (PSNR) when comparing the model-aided codec to TMN-10, the state-of-the-art test model of the H.263 standard. This corresponds to a gain of 2-3 dB in PSNR when encoding at the same average bit rate.
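The per-block Lagrangian mode decision described in the abstract can be sketched as follows. This is an illustrative toy sketch, not the paper's implementation: it minimizes the rate-distortion cost J = D + λR for each block, choosing between the synthesized model frame and the previous decoded frame as the predictor. The distortion measure (SSD), the rate figures, and the value of λ below are hypothetical placeholders.

```python
# Hedged sketch of Lagrangian mode decision between two prediction
# candidates, in the spirit of the model-aided codec's per-block choice.
# J = D + lambda * R, with D the SSD distortion of the candidate
# prediction and R the (illustrative) bits needed to signal that mode.

def ssd(block, prediction):
    """Sum of squared differences between a block and its prediction."""
    return sum((b - p) ** 2 for b, p in zip(block, prediction))

def choose_prediction(block, candidates, lam):
    """Return the name of the candidate minimizing J = D + lam * R.

    candidates: list of (name, predicted_block, rate_bits) tuples.
    """
    best_name, best_cost = None, float("inf")
    for name, pred, rate in candidates:
        cost = ssd(block, pred) + lam * rate
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name

# Illustrative 1-D "block" and two candidate predictors
# (real codecs work on 2-D pixel blocks, e.g. 16x16).
block = [10, 12, 11, 13]
candidates = [
    ("model_frame", [10, 12, 12, 13], 6),    # synthesized by model-based coder
    ("previous_frame", [9, 12, 11, 14], 4),  # previous decoded frame
]
print(choose_prediction(block, candidates, lam=0.85))  # prints "previous_frame"
```

Here the model frame predicts more accurately (D = 1 vs. 2), but its higher signaling rate makes the previous frame cheaper overall (J = 5.4 vs. 6.1) at this λ; lowering λ shifts the balance toward the lower-distortion model frame.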