Theories of visual object recognition must solve the problem of
recognizing 3D objects given that perceivers only receive 2D patterns
of light on their retinae. Recent findings from human psychophysics,
neurophysiology and machine vision provide converging evidence for
'image-based' models in which objects are represented as collections
of viewpoint-specific local features. This approach is contrasted with
'structural-description' models in which objects are represented as
configurations of 3D volumes or parts. We then review recent
behavioral results that address the biological plausibility of both
approaches, as well as some of their computational advantages and
limitations. We conclude that, although the image-based approach holds
great promise, it has potential pitfalls that may be best overcome by
including structural information. Thus, the most viable model of
object recognition may be one that incorporates the most appealing
aspects of both image-based and structural-description theories.
© 1998 Elsevier Science B.V. All rights reserved.