Spatial join selectivity using power laws

Citation
C. Faloutsos et al., Spatial join selectivity using power laws, SIG RECORD, 29(2), 2000, pp. 177-188
Citations number
35
Categorie Soggetti
Computer Science & Engineering
Journal title
SIGMOD RECORD
ISSN journal
01635808 → ACNP
Volume
29
Issue
2
Year of publication
2000
Pages
177 - 188
Database
ISI
SICI code
0163-5808(200006)29:2<177:SJSUPL>2.0.ZU;2-T
Abstract
We discovered a surprising law governing the spatial join selectivity acros s two sets of points. An example of such a spatial join is " find the libra ries that are within 10 miles of schools". Our law dictates that the number of such qualifying pairs follows a power law, whose exponent we call "pair -count exponent" (PC). We show that this law also holds for self-spatial-jo ins ("find schools within 5 miles of other schools") in addition to the gen eral case that the two point-sets are distinct. Our law holds for many real datasets, including diverse environments (geographic datasets, feature vec tors from biology data, galaxy data from astronomy). In addition, we introduce the concept of the Box-Occupancy-Product-Sum (BOP S) plot, and we show that it can compute the pair-count exponent in a timel y manner, reducing the run time by orders of magnitude, from quadratic to l inear. Due to the pair-count exponent and our analysis (Law 1), we can achi eve accurate selectivity estimates in constant time (O(1)) without the need for sampling or other expensive operations. The relative error in selectiv ity is about 30% with our fast BOPS method, and even better (about 10%), if we use the slower, quadratic method.