ITA
ENG

FAST ALGORITHMS FOR UNIVERSAL QUANTIFICATION IN LARGE DATABASES

Authors

GRAEFE G COLE RL

Citation

G. Graefe et Rl. Cole, FAST ALGORITHMS FOR UNIVERSAL QUANTIFICATION IN LARGE DATABASES, ACM transactions on database systems, 20(2), 1995, pp. 187-236

Citations number

Categorie Soggetti

Computer Sciences","Computer Science Information Systems","Computer Science Software Graphycs Programming

Journal title

ACM transactions on database systems → ACNP

ISSN journal

03625915

Volume

Issue

Year of publication

1995

Pages

187 - 236

Database

ISI

SICI code

0362-5915(1995)20:2<187:FAFUQI>2.0.ZU;2-N

Abstract

Universal quantification is not supported directly in most database sy stems despite the fact that it adds significant power to a system's qu ery processing and inference capabilities, in particular for the analy sis of many-to-many relationships and of set-valued attributes. One of the main reasons for this omission has been that universal quantifica tion algorithms and their performance have not been explored for large databases. In this article, we describe and compare three known algor ithms and one recently proposed algorithm for relational division, the algebra operator that embodies universal quantification. For each alg orithm, we investigate the performance effects of explicit duplicate r emoval and referential integrity enforcement, variants for inputs larg er than memory, and parallel execution strategies. Analytical and expe rimental performance comparisons illustrate the substantial difference s among the algorithms. Moreover, comparisons demonstrate that the rec ently proposed division algorithm evaluates a universal quantification predicate over two relations as fast as hash (semi-) join evaluates a n existential quantification predicate over the same relations. Thus, existential and universal quantification can be supported with equal e fficiency by adding the recently proposed algorithm to a query evaluat ion system. A second result of our study is that universal quantificat ion should be expressed directly in a database query language because most query optimizers do not recognize the rather indirect formulation s available in SQL as relational division, and therefore produce very poor evaluation plans for many universal quantification queries.