The pattern of retinal binocular disparities acquired by a fixating visual system depends on both the depth structure of the scene and the viewing geometry. This paper treats the problem of interpreting the disparity pattern in terms of scene structure without relying on estimates of fixation position from eye movement control and proprioception mechanisms. We propose a sequential decomposition of this interpretation process into disparity correction, which is used to compute three-dimensional structure up to a relief transformation, and disparity normalization, which is used to resolve the relief ambiguity to obtain metric structure. We point out that the disparity normalization stage can often be omitted, since relief transformations preserve important properties such as depth ordering and coplanarity. Based on this framework, we analyse three previously proposed computational models of disparity processing: the Mayhew and Longuet-Higgins model, the deformation model and the polar angle disparity model. We show how these models are related, and argue that none of them can account satisfactorily for the available psychophysical data. We therefore propose an alternative model, regional disparity correction. Using this model we derive predictions for a number of experiments based on vertical disparity manipulations, and compare them to available experimental data. The paper concludes with a summary and a discussion of the possible architectures and mechanisms underlying stereopsis in the human visual system.