3D lip shapes from video: A combined physical-statistical model

Citation
S. Basu et al., 3D lip shapes from video: A combined physical-statistical model, SPEECH COMM, 26(1-2), 1998, pp. 131-148
Number of citations
18
Subject categories
Computer Science & Engineering
Journal title
SPEECH COMMUNICATION
ISSN journal
0167-6393
Volume
26
Issue
1-2
Year of publication
1998
Pages
131-148
Database
ISI
SICI code
0167-6393(199810)26:1-2<131:3LSFVA>2.0.ZU;2-W
Abstract
Tracking human lips in video is an important but notoriously difficult task. To accurately recover their motions in 3D from any head pose is an even more challenging task, though still necessary for natural interactions. Our approach is to build and train 3D models of lip motion to make up for the information we cannot always observe when tracking. We use physical models as a prior and combine them with statistical models, showing how the two can be smoothly and naturally integrated into a synthesis method and a MAP estimation framework for tracking. We have found that this approach allows us to accurately and robustly track and synthesize the 3D shape of the lips from arbitrary head poses in a 2D video stream. We demonstrate this with numerical results on reconstruction accuracy, examples of static fits, and audio-visual sequences. (C) 1998 Elsevier Science B.V. All rights reserved.