This paper examines the inherent difficulties in observing 3D rigid motion from image sequences. It does so without considering a particular estimator; instead, it presents a statistical analysis of all the possible computational models which can be used for estimating 3D motion from an image sequence. These computational models are classified according to the mathematical constraints that they employ and the characteristics of the imaging sensor (restricted field of view versus full field of view). Regarding the mathematical constraints, two principles relate a sequence of images taken by a moving camera: the "epipolar constraint," applied to motion fields, and the "positive depth" constraint, applied to normal flow fields. 3D motion estimation amounts to optimizing these constraints over the image. A statistical modeling of these constraints leads to functions which are studied with regard to their topographic structure, specifically the errors in the 3D motion parameters at the locations of the functions' minima. For conventional video cameras possessing a restricted field of view, the analysis shows that, for algorithms in both classes which estimate all motion parameters simultaneously, the obtained solution has an error such that the projections of the translational and rotational errors on the image plane are perpendicular to each other. Furthermore, the estimated projection of the translation on the image lies on a line through the origin and the projection of the real translation. The situation is different for a camera with a full (360-degree) field of view (achieved by a panoramic sensor or by a system of conventional cameras). In this case, at the locations of the minima of the above two functions, either the translational or the rotational error becomes zero, whereas in the case of a restricted field of view both errors are non-zero. Although some ambiguities still remain in the full-field-of-view case, the implication is that visual navigation tasks involving 3D motion estimation, such as visual servoing, are easier to solve by employing panoramic vision. The analysis also makes it possible to compare the properties of algorithms that first estimate the translation and, on the basis of the translational result, estimate the rotation; algorithms that do the opposite; and algorithms that estimate all motion parameters simultaneously, thus providing a sound framework for the observability of 3D motion. Finally, the introduced framework points to new avenues for studying the stability of image-based servoing schemes.
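To make the first of the two principles concrete, the following is a minimal numerical sketch of the epipolar constraint in its discrete two-view form: for an essential matrix E = [t]_x R built from a rigid motion (R, t), corresponding image points p1, p2 satisfy p2^T E p1 = 0. The specific motion parameters, the 3D point, and the unit-focal-length pinhole model below are hypothetical choices for illustration only, not values from the paper.

```python
import numpy as np

def skew(v):
    # cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# hypothetical rigid motion: small rotation about the y-axis plus a translation
theta = 0.05
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.2])

E = skew(t) @ R  # essential matrix E = [t]_x R

# a synthetic 3D point, expressed in both camera frames (X2 = R X1 + t)
X1 = np.array([0.3, -0.4, 5.0])
X2 = R @ X1 + t

# homogeneous image coordinates under a unit-focal-length pinhole model
p1 = X1 / X1[2]
p2 = X2 / X2[2]

# the epipolar residual vanishes (up to floating-point error) for a
# correct motion hypothesis; 3D motion estimation searches for the
# (R, t) that minimizes such residuals over the whole image
residual = p2 @ E @ p1
```

For noise-free correspondences and the true (R, t), the residual is zero up to rounding; the paper's analysis concerns the topography of this kind of objective, summed over the image, when the data are noisy.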