This paper presents the surface-based factorization method to recover
three-dimensional (3-D) structure, i.e., the 3-D shape and 3-D motion, of a
rigid object from a two-dimensional (2-D) video sequence. The main
ingredients of our approach are as follows:
1) we describe the unknown shape of the 3-D rigid object by polynomial
patches;
2) projections of these patches in the image plane move according to
parametric 2-D motion models;
3) we recover the parameters describing the 3-D shape and 3-D motion from
the 2-D motion parameters by factorizing a matrix that is rank 1 in a
noiseless situation.
Our method is simultaneously an extension and a simplification of the
original factorization method of Tomasi and Kanade [1]. We track regions
where the 2-D motion in the image plane is described by a single set of
parameters, which avoids the need to track a large number of pointwise
features, in general a difficult task. Our method then estimates the
parameters describing the 3-D structure by factoring a rank 1 matrix, not a
rank 3 matrix as in [1]. This allows the use of fast iterative algorithms to
compute the 3-D structure that best fits the data. Experimental results with
real-life video sequences illustrate the good performance of our approach.
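
To illustrate why a rank 1 model admits a fast iterative solution, the following is a minimal sketch of one such algorithm, the power method, which alternates two least-squares updates to approximate a matrix R by an outer product a b^T. The abstract does not say which iterative algorithm is used, so the choice of the power method, the function name `rank1_factor`, and its parameters are all assumptions made for illustration.

```python
import math
import random


def rank1_factor(R, iters=100, tol=1e-12, seed=0):
    """Approximate R (a list of rows) by the outer product a b^T.

    Hypothetical helper: a plain power iteration, not necessarily the
    algorithm used in the paper. b is kept at unit norm; a carries the scale.
    """
    rows, cols = len(R), len(R[0])
    rng = random.Random(seed)

    # Random unit-norm starting guess for b.
    b = [rng.uniform(-1.0, 1.0) for _ in range(cols)]
    norm = math.sqrt(sum(x * x for x in b))
    b = [x / norm for x in b]

    for _ in range(iters):
        # With b fixed at unit norm, the least-squares a is R b.
        a = [sum(R[i][j] * b[j] for j in range(cols)) for i in range(rows)]
        # With a fixed, the least-squares b is proportional to R^T a.
        new_b = [sum(R[i][j] * a[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in new_b))
        if norm == 0.0:
            break  # R b vanished (e.g. R is zero); nothing left to refine
        new_b = [x / norm for x in new_b]
        change = max(abs(x - y) for x, y in zip(new_b, b))
        b = new_b
        if change < tol:
            break

    # Final update so that a is consistent with the returned b.
    a = [sum(R[i][j] * b[j] for j in range(cols)) for i in range(rows)]
    return a, b
```

Each iteration costs only two matrix-vector products, and for a noiseless rank 1 matrix the direction of b is recovered after a single pass. With noisy data the iteration converges toward the best rank 1 fit in the least-squares sense, provided the starting guess is not orthogonal to the dominant direction.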