The problem of clustering a real s-dimensional data set X = {x_1, ..., x_n}
⊂ R^s is considered. Usually, each observation (or datum) consists
of numerical values for all s features (such as height, length, etc.), but
sometimes data sets can contain vectors that are missing one or more of
the feature values. For example, a particular datum x_k might be incomplete,
having the form x_k = (254.3, ?, 333.2, 47.44, ?)^T, where the second and
fifth feature values are missing. The fuzzy c-means (FCM) algorithm is a
useful tool for clustering real s-dimensional data, but it is not directly
applicable to the case of incomplete data. Four strategies for doing FCM
clustering of incomplete data sets are given, three of which involve modified
versions of the FCM algorithm. Numerical convergence properties of the new
algorithms are discussed, and all approaches are tested using real and
artificially generated incomplete data sets.
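
The abstract does not spell out the four strategies, so the following is only an illustrative sketch of why standard FCM cannot be applied directly and of one device that is commonly used to work around this. It assumes missing feature values are encoded as NaN and computes a partial squared distance between an incomplete datum and a cluster prototype, using only the observed features and rescaling by s over the number of observed features. The function name `partial_sq_distance` and the prototype `v` are hypothetical and chosen purely for illustration; this is not presented as the authors' formulation.

```python
import math


def partial_sq_distance(x, v):
    """Squared Euclidean distance between datum x and prototype v,
    summed over the features observed in x (non-NaN) and rescaled
    by s / (number of observed features)."""
    if len(x) != len(v):
        raise ValueError("x and v must have the same number of features")
    # Keep only the coordinate pairs where the datum has a value.
    observed = [(xj, vj) for xj, vj in zip(x, v) if not math.isnan(xj)]
    if not observed:
        raise ValueError("datum has no observed feature values")
    total = sum((xj - vj) ** 2 for xj, vj in observed)
    return len(x) / len(observed) * total


# The incomplete datum from the text; NaN marks a missing feature value.
x_k = [254.3, math.nan, 333.2, 47.44, math.nan]

# Hypothetical cluster prototype (assumption, not taken from the source).
v = [250.0, 10.0, 330.0, 50.0, 5.0]

# An ordinary Euclidean distance would be NaN here; the partial
# distance stays finite because missing coordinates are skipped.
d = partial_sq_distance(x_k, v)
```

With a complete datum the rescaling factor equals one, so the function reduces to the ordinary squared Euclidean distance that FCM uses by default.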