In this paper, we address the problem of recovering a realistic
textured model of a scene from a sequence of images, without any prior
knowledge either about the parameters of the cameras or about their
motion. We do not require any knowledge of the absolute coordinates of
control points in the scene to achieve this goal. First, using various
computer vision tools, we establish correspondences between the images
and recover the epipolar geometry, from which we show how to compute
the complete set of perspective projection matrices for all camera
positions. Then, we proceed to reconstruct the geometry of the scene.
We show how to rely on information about the scene, such as parallel
lines or known angles, in order to reconstruct the geometry of the
scene up to, respectively, an unknown affine transformation or an
unknown similitude. Alternatively, if this information is not
available, we can still recover the Euclidean structure of the scene
through the techniques of self-calibration. The scene geometry is
modeled as a set of polyhedra. Textures to be mapped on the scene
polygons are extracted automatically from the images. We show how
several images can be combined through mosaicing in order to
automatically remove visual artifacts such as pedestrians or trees
from the textures. This vision system has been implemented as a vision
server, which provides a CAD-CAM modeler with geometry or texture
information extracted from the set of images. The whole system allows
fast and efficient production of high-quality scene models for
applications such as simulation and virtual or augmented reality.
(C) 1998 Academic Press.