ITA
ENG

A theory of proximity based clustering: structure detection by optimization

Authors

Puzicha, J Hofmann, T Buhmann, JM

Citation

J. Puzicha et al., A theory of proximity based clustering: structure detection by optimization, PATT RECOG, 33(4), 2000, pp. 617-634

Citations number

Categorie Soggetti

AI Robotics and Automatic Control

Journal title

PATTERN RECOGNITION

ISSN journal

00313203 → ACNP

Volume

Issue

Year of publication

2000

Pages

617 - 634

Database

ISI

SICI code

0031-3203(200004)33:4<617:ATOPBC>2.0.ZU;2-F

Abstract

In this paper, a systematic optimization approach for clustering proximity or similarity data is developed. Starting from Fundamental invariance and r obustness properties, a set of axioms is proposed and discussed to distingu ish different cluster compactness and separation criteria. The approach cov ers the case of sparse proximity matrices, and is extended to nested partit ionings for hierarchical data clustering. To solve the associated optimizat ion problems, a rigorous mathematical framework for deterministic annealing and mean-field approximation is presented. Efficient optimization heuristi cs are derived in a canonical way, which also clarifies the relation to sto chastic optimization by Gibbs sampling. Similarity-based clustering techniq ues have a broad range of possible applications in computer vision, pattern recognition, and data analysis. As a major practical application we presen t a novel approach to the problem of unsupervised texture segmentation, whi ch relies on statistical tests as a measure of homogeneity. The quality of the algorithms is empirically evaluated on a large collection of Brodatz-li ke micro-texture Mondrians and on a set of real-word images. To demonstrate the broad usefulness of the theory of proximity based clustering the perfo rmances of different criteria and algorithms are compared on an information retrieval task for a document database. The superiority of optimization al gorithms for clustering is supported by extensive experiments. (C) 2000 Pat tern Recognition Society. Published by Elsevier Science Ltd. All rights res erved.