On finding the number of clusters

Citation
R. Kothari and D. Pitts, On finding the number of clusters, PATT REC L, 20(4), 1999, pp. 405-416
Number of citations
12
Subject categories
AI Robotics and Automatic Control
Journal title
PATTERN RECOGNITION LETTERS
Journal ISSN
0167-8655
Volume
20
Issue
4
Year of publication
1999
Pages
405 - 416
Database
ISI
SICI code
0167-8655(199904)20:4<405:OFTNOC>2.0.ZU;2-X
Abstract
We present a novel approach to finding the number of clusters in data based on the minimization of a regularized cost function. Minimization of the proposed cost function results in the minimization of the sum-of-squared distances of the data points from the respective nearest cluster center as well as the sum-of-squared distances of the individual cluster centers from neighborhood cluster centers. Smaller values of the neighborhood encourage the formation of more distinct cluster centers, while larger values of the neighborhood encourage the formation of fewer distinct cluster centers. We identify the neighborhood as a scale parameter and obtain the number of cluster centers at varying values of the scale parameter. The number of cluster centers in the data is then obtained based on persistence over the largest range of the scale parameter. Four simulations are presented to illustrate the efficacy of the proposed algorithm. © 1999 Elsevier Science B.V. All rights reserved.
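
The two ideas in the abstract, a cost with a data term plus a center-to-center term, and selection of the cluster count by persistence across scales, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact cost function, so the weight `lam`, the Euclidean-radius definition of the neighborhood, the merge tolerance `tol`, and all function names are assumptions made here for illustration. The optimization step itself is omitted.

```python
import numpy as np

def regularized_cost(X, C, scale, lam=1.0):
    """Data term plus center-neighborhood term (assumed form, not the paper's exact cost)."""
    # Squared distance from every data point to every center.
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
    data_term = d2.min(axis=1).sum()  # each point contributes its nearest center only
    # Squared distance between every pair of centers.
    c2 = ((C[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
    # Assumption: centers are neighbors when they lie within `scale` of each other.
    neighbors = c2 <= scale ** 2
    center_term = c2[neighbors].sum() / 2.0  # count each unordered pair once
    return data_term + lam * center_term

def count_distinct(C, tol=1e-3):
    """Count centers that are farther than `tol` from every center already kept."""
    kept = []
    for c in C:
        if all(np.linalg.norm(c - k) > tol for k in kept):
            kept.append(c)
    return len(kept)

def most_persistent_count(scales, counts):
    """Return the count that stays unchanged over the widest range of `scales`.

    `scales` must be sorted in ascending order; `counts[i]` is the number of
    distinct centers found at `scales[i]`.
    """
    best_count, best_span = counts[0], -1.0
    start = 0
    for i in range(1, len(counts) + 1):
        # A run ends at the end of the list or when the count changes.
        if i == len(counts) or counts[i] != counts[start]:
            span = scales[i - 1] - scales[start]
            if span > best_span:
                best_count, best_span = counts[start], span
            start = i
    return best_count
```

In use, one would minimize `regularized_cost` over the centers at each value of the scale parameter, record `count_distinct` of the result, and pass the resulting lists to `most_persistent_count`. With this assumed form, a larger `scale` makes more center pairs count as neighbors, which penalizes spread-out centers and pulls them together, matching the qualitative behavior described in the abstract.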