ITA
ENG

Approximating multi-dimensional aggregate range queries over real attributes

Authors

Gunopulos, D Kollios, G Domeniconi, C Tsotras, VJ

Citation

D. Gunopulos et al., Approximating multi-dimensional aggregate range queries over real attributes, SIG RECORD, 29(2), 2000, pp. 463-474

Citations number

Categorie Soggetti

Computer Science & Engineering

Journal title

SIGMOD RECORD

ISSN journal

01635808 → ACNP

Volume

Issue

Year of publication

2000

Pages

463 - 474

Database

ISI

SICI code

0163-5808(200006)29:2<463:AMARQO>2.0.ZU;2-M

Abstract

Finding approximate answers to multi-dimensional range queries over real va lued attributes has significant applications in data exploration and databa se query optimization. In this paper we consider the following problem: giv en a table of d attributes whose domain is the real numbers, and a query th at specifies a range in each dimension, find a good approximation of the nu mber of records in the table that satisfy the query. We present a new histogram technique that is designed to approximate the de nsity of multi-dimensional datasets with real attributes. Our technique fin ds buckets of variable size, and allows the buckets to overlap. Overlapping buckets allow more efficient approximation of the density. The size of the cells is based on the local density of the data. This technique leads to a faster and more compact approximation of the data distribution. We also sh ow how to generalize kernel density estimators, and how to apply them on th e multi-dimensional query approximation problem. Finally, we compare the accuracy of the proposed techniques with existing t echniques using real and synthetic datasets.