In this paper, the problem of extracting and grouping image features from complex scenes is addressed by a hierarchical approach based on two main processes: voting and clustering. Voting assigns a score to both global and local features. The score represents the evidential support that the input data provide for the presence of a feature. Clustering aims at identifying a minimal set of significant local features by grouping together simpler, correlated observations. It is based on a spatial relation between simple observations at a fixed level, i.e., on the definition of a distance in an appropriate space. Because the multilevel structure of the system implies that the input data for an intermediate level are the outputs of the level below it, voting can be seen as a functional representation of the "part-of" relation between features at different abstraction levels. The proposed approach has been tested on both synthetic and real images and compared with other existing feature grouping methods.
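
The interplay of the two processes can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the function names `cluster_by_distance` and `vote`, the Euclidean distance, the `radius` threshold, and the normalized-count score are all assumptions chosen for illustration. Clustering groups simple observations at one level by a distance criterion, and voting then scores each resulting higher-level feature by the share of observations that support it.

```python
from math import dist


def cluster_by_distance(points, radius):
    """Group points so that any two closer than `radius` share a cluster.

    Single-linkage grouping via union-find; the distance and the threshold
    are illustrative assumptions, not taken from the paper.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= radius:
                parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())


def vote(groups):
    """Score each group by the fraction of all observations it contains.

    The score stands in for the evidential support of a higher-level
    feature; each group member is "part of" that feature.
    """
    total = sum(len(g) for g in groups)
    return [(g, len(g) / total) for g in groups]


# Hypothetical low-level observations (e.g. edge-point coordinates).
observations = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
features = vote(cluster_by_distance(observations, radius=1.5))
```

In a multilevel system, the scored groups returned here would serve as the observations for the next level up, where the same two steps would be repeated in a different feature space.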