The notion that visual attention can operate over visual objects in addition to spatial locations has recently received much empirical support, but there has been relatively little empirical consideration of what can count as an 'object' in the first place. We have investigated this question in the context of the multiple object tracking paradigm, in which subjects must track a number of independently and unpredictably moving identical items in a field of identical distracters. What types of feature clusters can be tracked in this manner? In other words, what counts as an 'object' in this task? We investigated this question with a technique we call target merging: we alter tracking displays so that distinct target and distracter locations appear perceptually to be parts of the same object by merging pairs of items (one target with one distracter) in various ways - for example, by connecting item locations with a simple line segment, by drawing the convex hull of the two items, and so forth. The data show that target merging makes the tracking task far more difficult, to varying degrees depending on exactly how the items are merged. The effect is perceptually salient, involving in some conditions a total destruction of subjects' capacity to track multiple items. These studies provide strong evidence for the object-based nature of tracking, confirming that in some contexts attention must be allocated to objects rather than arbitrary collections of features. In addition, the results begin to reveal the types of spatially organized scene components that can be independently attended as a function of properties such as connectedness, part structure, and other types of perceptual grouping. (C) 2001 Elsevier Science B.V. All rights reserved.