A feature selection method is proposed to select a subset of variables in
sequential projection pursuit (SPP) analysis in order to preserve as much
sample clustering information as possible. The inhomogeneity of the complete
data is explored by SPP, and the retained inhomogeneity information of a
candidate subset is measured by means of the percentage of consensus in
generalised Procrustes analysis. The best subset is obtained by applying a
genetic algorithm (GA) which optimises the consensus between the subset and
the complete data set. An improved algorithm is proposed which enables
analysis of high-dimensional data. The method was studied on three
high-dimensional
industrial data sets. The results show that the proposed method successfully
identified inhomogeneity-bearing variables and led to subsets of variables
that preserve interesting clustering information better than those from the
other feature selection methods studied. (C) 2001 Elsevier Science B.V. All
rights reserved.
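
The selection loop described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a principal component projection stands in for SPP, a two-configuration Procrustes fit stands in for the percentage of consensus from generalised Procrustes analysis, and the search is a simplified evolutionary loop (elitism and swap mutation only, no crossover) rather than a full GA. The function names `pca_scores`, `procrustes_agreement`, `subset_score` and `select_subset` are hypothetical.

```python
import numpy as np


def pca_scores(X, k=2):
    """Assumed stand-in for SPP: scores on the top-k principal axes."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]


def procrustes_agreement(A, B):
    """Agreement between two configurations after optimal rotation and scaling.

    Both inputs are centred and scaled to unit Frobenius norm, so the result
    is 1 minus the Procrustes residual and lies in [0, 1] up to rounding.
    """
    A0 = A - A.mean(axis=0)
    B0 = B - B.mean(axis=0)
    A0 /= np.linalg.norm(A0)
    B0 /= np.linalg.norm(B0)
    sing = np.linalg.svd(A0.T @ B0, compute_uv=False)
    return float(sing.sum() ** 2)


def subset_score(X, mask, k=2):
    """How well the projection of the subset reproduces that of the full data."""
    return procrustes_agreement(pca_scores(X, k), pca_scores(X[:, mask], k))


def select_subset(X, n_keep, pop=30, gens=40, seed=0):
    """Simplified evolutionary search (assumption: no crossover) for a mask.

    Requires k <= n_keep < number of variables.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]

    def random_mask():
        m = np.zeros(p, dtype=bool)
        m[rng.choice(p, n_keep, replace=False)] = True
        return m

    def mutate(m):
        # Swap one selected variable for one unselected variable.
        m = m.copy()
        i = rng.choice(np.flatnonzero(m))
        j = rng.choice(np.flatnonzero(~m))
        m[i], m[j] = False, True
        return m

    population = [random_mask() for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda m: subset_score(X, m), reverse=True)
        elite = population[: pop // 2]  # the better half survives unchanged
        children = [mutate(elite[rng.integers(len(elite))])
                    for _ in range(pop - len(elite))]
        population = elite + children
    return max(population, key=lambda m: subset_score(X, m))
```

Calling `select_subset(X, n_keep=5)` on a samples-by-variables array `X` returns a boolean mask over the columns, and `subset_score(X, mask)` reports how closely the subset's projection matches that of the complete data under these assumptions.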