D. Psaltis et al., ON THE FINITE-SAMPLE PERFORMANCE OF THE NEAREST-NEIGHBOR CLASSIFIER, IEEE transactions on information theory, 40(3), 1994, pp. 820-837
Citations number
14
Categorie Soggetti
Information Science & Library Science","Engineering, Eletrical & Electronic
The finite sample performance of a nearest neighbor classifier is anal
yzed for a two-class pattern recognition problem. An exact integral ex
pression is derived for the m-sample risk R(m) given that a reference
m-sample of labeled points is available to the classifier. The statist
ical setup assumes that the pattern classes arise in nature with fixed
a priori probabilities and that points representing the classes are d
rawn from Euclidean n-space according to fixed class-conditional proba
bility distributions. The sample is assumed to consist of m independen
tly generated class-labeled points. For a family of smooth class-condi
tional distributions characterized by asymptotic expansions in general
form, it is shown that the m-sample risk R(m) has a complete asymptot
ic series expansion [GRAPHICS] where R(infinity) denotes the nearest n
eighbor risk in the infinite-sample limit and the coefficients c(k) ar
e distribution-dependent constants independent of the sample size m. T
his analysis thus provides further analytic validation of Bellman's cu
rse of dimensionality. Numerical simulations corroborating the formal
results are included, and extensions of the theory discussed. The anal
ysis also contains a novel application of Laplace's asymptotic method
of integration to a multidimensional integral where the integrand atta
ins its maximum on a continuum of points.