W. Wu and D.L. Massart, REGULARIZED NEAREST-NEIGHBOR CLASSIFICATION METHOD FOR PATTERN-RECOGNITION OF NEAR-INFRARED SPECTRA, Analytica Chimica Acta, 349(1-3), 1997, pp. 253-261
When a data set contains a large number of variables compared to the number of objects, the k nearest neighbour classification method (kNN) cannot be applied with the Mahalanobis distance as similarity criterion, because the class covariance matrix is then singular and cannot be inverted. To solve this problem, kNN is modified to the regularised nearest neighbour classification method (RNN) by using the regularised covariance matrix in the Mahalanobis distance, in the same way that LDA and/or QDA are modified to regularised discriminant analysis (RDA). Four simulated data sets and 14 real NIR data sets were studied to compare the new method with the classical kNN using Euclidean and Mahalanobis distances. Our results demonstrate that, in all kinds of data sets, RNN improves the classification results by regularising the class covariance matrix. When the ratio of variables to objects is very high, RNN cannot be applied directly: the data dimensionality must first be reduced. This is, for instance, the case when one wants to apply the method to the classification of NIR spectra. RNN performs better than RDA when the normal distribution assumption is severely violated, and worse when the data are normally distributed, since it is a non-parametric version of RDA. However, RNN is more time consuming than RDA.
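
The idea described above can be illustrated with a minimal sketch. The abstract does not give the exact regularisation formula, so the code below assumes a Friedman-style RDA scheme with two parameters: `lam` shrinks each class covariance towards the pooled covariance, and `gamma` shrinks the result towards a scaled identity matrix. The function names `regularised_covariances` and `rnn_predict`, the parameter names and the default values are all hypothetical, chosen for illustration only, and this should not be read as the authors' implementation.

```python
import numpy as np


def regularised_covariances(X, y, lam=0.5, gamma=0.1):
    """Return one regularised covariance matrix per class.

    Assumed RDA-style form (not taken from the abstract):
      1. blend the class scatter matrix with the pooled scatter (lam),
      2. shrink the result towards (trace / p) * identity (gamma).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape
    scatters, counts = {}, {}
    for c in np.unique(y):
        Xc = X[y == c]
        centred = Xc - Xc.mean(axis=0)
        scatters[c] = centred.T @ centred
        counts[c] = Xc.shape[0]
    pooled_scatter = sum(scatters.values())

    covs = {}
    for c in scatters:
        blended = (1 - lam) * scatters[c] + lam * pooled_scatter
        weight = (1 - lam) * counts[c] + lam * n
        cov = blended / weight
        # Shrinking towards a scaled identity makes the matrix invertible
        # (for gamma > 0) even when variables outnumber objects.
        covs[c] = (1 - gamma) * cov + gamma * (np.trace(cov) / p) * np.eye(p)
    return covs


def rnn_predict(X_train, y_train, X_test, k=3, lam=0.5, gamma=0.1):
    """kNN vote using a Mahalanobis distance built from the regularised
    covariance of the class each training object belongs to."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    covs = regularised_covariances(X_train, y_train, lam, gamma)
    inverses = {c: np.linalg.inv(S) for c, S in covs.items()}

    predictions = []
    for x in np.asarray(X_test, dtype=float):
        diffs = X_train - x
        d2 = np.empty(len(X_train))
        for c, P in inverses.items():
            mask = y_train == c
            D = diffs[mask]
            # Squared Mahalanobis distance d^T P d for every row d of D.
            d2[mask] = np.einsum("ij,jk,ik->i", D, P, D)
        nearest = np.argsort(d2)[:k]
        labels, votes = np.unique(y_train[nearest], return_counts=True)
        predictions.append(labels[np.argmax(votes)])
    return np.array(predictions)
```

In this sketch, setting `lam=0` and `gamma=0` falls back to the unregularised class covariance, which fails to invert when there are more variables than objects per class. That is the situation the regularisation is meant to avoid. Each prediction needs a distance to every training object, which is consistent with the remark that RNN is more time consuming than RDA.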