In this paper, a systematic optimization approach for clustering proximity
or similarity data is developed. Starting from Fundamental invariance and r
obustness properties, a set of axioms is proposed and discussed to distingu
ish different cluster compactness and separation criteria. The approach cov
ers the case of sparse proximity matrices, and is extended to nested partit
ionings for hierarchical data clustering. To solve the associated optimizat
ion problems, a rigorous mathematical framework for deterministic annealing
and mean-field approximation is presented. Efficient optimization heuristi
cs are derived in a canonical way, which also clarifies the relation to sto
chastic optimization by Gibbs sampling. Similarity-based clustering techniq
ues have a broad range of possible applications in computer vision, pattern
recognition, and data analysis. As a major practical application we presen
t a novel approach to the problem of unsupervised texture segmentation, whi
ch relies on statistical tests as a measure of homogeneity. The quality of
the algorithms is empirically evaluated on a large collection of Brodatz-li
ke micro-texture Mondrians and on a set of real-word images. To demonstrate
the broad usefulness of the theory of proximity based clustering the perfo
rmances of different criteria and algorithms are compared on an information
retrieval task for a document database. The superiority of optimization al
gorithms for clustering is supported by extensive experiments. (C) 2000 Pat
tern Recognition Society. Published by Elsevier Science Ltd. All rights res
erved.