ITA
ENG

ACCOUNTING FOR BOUNDARY EFFECTS IN NEAREST-NEIGHBOR SEARCHING

Authors

ARYA S MOUNT DM NARAYAN O

Citation

S. Arya et al., ACCOUNTING FOR BOUNDARY EFFECTS IN NEAREST-NEIGHBOR SEARCHING, Discrete & computational geometry, 16(2), 1996, pp. 155-176

Citations number

Categorie Soggetti

Computer Sciences, Special Topics","Mathematics, General","Computer Science Theory & Methods",Mathematics

Journal title

Discrete & computational geometry → ACNP

ISSN journal

01795376

Volume

Issue

Year of publication

1996

Pages

155 - 176

Database

ISI

SICI code

0179-5376(1996)16:2<155:AFBEIN>2.0.ZU;2-Y

Abstract

Given n data points in d-dimensional space, nearest-neighbor searching involves determining the nearest of these data points to a given quer y point. Most average-case analyses of nearest-neighbor searching algo rithms are made under the simplifying assumption that d is fixed and t hat n is so large relative to d that boundary effects can be ignored. This means that for any query point the statistical distribution of th e data points surrounding it is independent of the location of the que ry point. However, in many applications of nearest-neighbor searching (such as data compression by vector quantization) this assumption is n ot met, since the number of data points n grows roughly as 2(d). Large ly for this reason, the actual performances of many nearest-neighbor a lgorithms tend to be much better than their theoretical analyses would suggest. We present evidence of why this is the case. We provide an a ccurate analysis of the number of cells visited in nearest-neighbor se arching by the bucketing and k-d tree algorithms. We assume m(d) point s uniformly distributed in dimension d, where m is a fixed integer gre ater than or equal to 2. Further, we assume that distances are measure d in the L(infinity) metric. Our analysis is tight in the limit as d a pproaches infinity. Empirical evidence is presented showing that the a nalysis applies even in low dimensions.