Generating face models of humans from video sequences is an important
problem in many multimedia applications, ranging from teleconferencing to
virtual reality. Most practical approaches fit a generic face model to the
two-dimensional image and adjust the model parameters to arrive at the
final answer. These approaches require the identification of specific
landmarks on the face, and this identification may or may not be
automated. In this paper, we present a method for deriving the
three-dimensional (3-D) face model from a monocular image sequence, using
a few standard results from the affine camera geometry literature in
computer vision and spline-fitting techniques adopted from the
nonparametric regression literature in statistics. The system needs no
prior knowledge of the camera calibration parameters or the shape of the
face, and the entire process requires no user intervention. The system has
been successfully demonstrated to extract the 3-D face structure of humans
in several image sequences.