Analysis of range queries and self-spatial join queries on real region datasets stored using an R-tree

Citation
G. Proietti et C. Faloutsos, Analysis of range queries and self-spatial join queries on real region datasets stored using an R-tree, IEEE KNOWL, 12(5), 2000, pp. 751-762
Citations number
20
Categorie Soggetti
AI Robotics and Automatic Control
Journal title
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
ISSN journal
10414347 → ACNP
Volume
12
Issue
5
Year of publication
2000
Pages
751 - 762
Database
ISI
SICI code
1041-4347(200009/10)12:5<751:AORQAS>2.0.ZU;2-H
Abstract
In this paper, we study the node distribution of an R-tree storing region d ata, like, for instance, islands, lakes, or human-inhabited areas. We will show that real region datasets are packed in an R-tree into minimum boundin g rectangles (MBRs) whose area distribution follows the same power law, nam ed REGAL (REGion Area Law), as that for the regions themselves. Moreover, t hese MBRs are packed in their turn into MBRs following the same law, and so on iteratively, up to the root of the R-tree. Based on this observation. w e are able to accurately estimate the search effort for range queries, usin g a small number of easy-to-retrieve parameters. Furthermore, since our ana lysis exploits, through a realistic mathematical model, the proximity relat ions existing among the regions in the dataset, we show how to use our mode l to predict the selectivity of a self-spatial join query posed on the data set. Experiments on a variety of real datasets (islands, lakes, human-inhab ited areas) show that our estimations are accurate, enjoying a geometric av erage relative error ranging from 22 percent to 32 percent for the search e ffort of a range query, and from 14 percent to 34 percent for the selectivi ty of a self-spatial join query. This is significantly better than using a naive model based on uniformity assumption, which gives rise to a geometric average relative error up to 270 percent and up to 85 percent for the two problems, respectively.