HOW TO CHOOSE A REPRESENTATIVE SUBSET FROM A SET OF DATA IN MULTIDIMENSIONAL SPACE

Authors
B.B. Chaudhuri
Citation
B.B. Chaudhuri, HOW TO CHOOSE A REPRESENTATIVE SUBSET FROM A SET OF DATA IN MULTIDIMENSIONAL SPACE, Pattern Recognition Letters, 15(9), 1994, pp. 893-899
Number of citations
6
Subject categories
Computer Sciences, Special Topics; Computer Science, Artificial Intelligence
Journal title
Pattern Recognition Letters
Journal ISSN
0167-8655
Volume
15
Issue
9
Year of publication
1994
Pages
893 - 899
Database
ISI
SICI code
0167-8655(1994)15:9<893:HTCARS>2.0.ZU;2-V
Abstract
Given a set of N points in multi-dimensional space, it may be necessary to choose a subset of n representative points. For example, in clustering problems, it is necessary to choose a few seed points around which the clusters may grow. This problem may be posed as that of choosing one out of every k data, where $\lfloor N/n \rfloor = k$. In our proposed method, the data points are ordered in decreasing magnitude of density. The datum topping the ordered list is chosen and its k-1 nearest neighbours are deleted from the ordered list. From the remaining data, the one currently topping the list is chosen. The process is repeated until the data are exhausted. The problem of a more general choice of n is also addressed.
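
The selection procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the author's implementation: the abstract does not say how density is estimated, so the sketch assumes the inverse of the distance to the m-th nearest neighbour. The function name choose_representatives, the parameter m, the use of integer division for k, and the choice to search for neighbours only among data still on the list are likewise assumptions.

```python
import math


def choose_representatives(points, n, m=3):
    """Pick roughly n representatives from a list of coordinate tuples.

    Hypothetical sketch: the density estimate (inverse distance to the
    m-th nearest neighbour) is an assumption, not taken from the paper.
    """
    N = len(points)
    if not 1 <= n <= N:
        raise ValueError("n must satisfy 1 <= n <= N")
    k = N // n  # one representative per k data (floor is an assumption)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Assumed density estimate for every point.
    m_eff = min(m, N - 1)
    density = []
    for i in range(N):
        d = sorted(dist(i, j) for j in range(N) if j != i)
        radius = d[m_eff - 1] if d else 0.0
        density.append(1.0 / radius if radius > 0 else math.inf)

    # Order the data by decreasing density (ties keep input order).
    order = sorted(range(N), key=lambda i: -density[i])

    remaining = set(range(N))
    chosen = []
    for i in order:
        if i not in remaining:
            continue  # already deleted as a neighbour of an earlier choice
        chosen.append(i)
        remaining.discard(i)
        # Delete the k-1 nearest neighbours that are still on the list.
        nearest = sorted(remaining, key=lambda j: dist(i, j))[: k - 1]
        remaining.difference_update(nearest)
    return [points[i] for i in chosen]


# Two well-separated groups of three points, one seed wanted per group.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = choose_representatives(pts, n=2, m=2)
print(seeds)  # [(0, 0), (10, 10)]
```

When N is not a multiple of n, this sketch may return more than n points; the abstract notes that the paper treats the more general choice of n separately, and that treatment is not reproduced here.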