Fuzzy c-means clustering of incomplete data

Authors

Hathaway, RJ; Bezdek, JC

Citation

R.J. Hathaway and J.C. Bezdek, Fuzzy c-means clustering of incomplete data, IEEE SYST B, 31(5), 2001, pp. 735-744

Subject Categories

AI Robotics and Automatic Control

Journal title

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS

ISSN journal

1083-4419

Volume

31

Issue

5

Year of publication

2001

Pages

735 - 744

Database

ISI

SICI code

1083-4419(200110)31:5<735:FCCOID>2.0.ZU;2-7

Abstract

The problem of clustering a real s-dimensional data set X = {x_1, ..., x_n} ⊂ R^s is considered. Usually, each observation (or datum) consists of numerical values for all s features (such as height, length, etc.), but sometimes data sets can contain vectors that are missing one or more of the feature values. For example, a particular datum x_k might be incomplete, having the form x_k = (254.3, ?, 333.2, 47.44, ?)^T, where the second and fifth feature values are missing. The fuzzy c-means (FCM) algorithm is a useful tool for clustering real s-dimensional data, but it is not directly applicable to the case of incomplete data. Four strategies for doing FCM clustering of incomplete data sets are given, three of which involve modified versions of the FCM algorithm. Numerical convergence properties of the new algorithms are discussed, and all approaches are tested using real and artificially generated incomplete data sets.
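
The abstract does not name the four strategies, so the following is only an illustrative sketch of one common way to compare an incomplete datum with a cluster prototype: a partial distance computed over the observed features and rescaled by s divided by the number of observed features. The function name `partial_distance_sq`, the use of NaN to encode "?", and the prototype values are assumptions made for this example, not details taken from the record.

```python
import math

NAN = float("nan")  # assumption: a missing feature value ("?") is encoded as NaN


def partial_distance_sq(x, v):
    """Squared partial distance between datum x and prototype v.

    Only features observed in x contribute; the sum is rescaled by
    s / (number of observed features) so that data with different
    numbers of missing values remain comparable.
    """
    if len(x) != len(v):
        raise ValueError("datum and prototype must have the same dimension")
    observed = [(xj, vj) for xj, vj in zip(x, v) if not math.isnan(xj)]
    if not observed:
        raise ValueError("datum has no observed features")
    scale = len(x) / len(observed)
    return scale * sum((xj - vj) ** 2 for xj, vj in observed)


# The incomplete datum from the abstract: second and fifth features missing.
x_k = [254.3, NAN, 333.2, 47.44, NAN]
# Hypothetical prototype, chosen only to make the example runnable.
v_i = [250.0, 10.0, 330.0, 50.0, 1.0]

d_sq = partial_distance_sq(x_k, v_i)
```

A distance of this kind could stand in for the ordinary squared Euclidean distance wherever FCM compares a datum with a prototype; for a complete datum the scale factor is 1 and the ordinary squared distance is recovered.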