FAST ALGORITHMS FOR UNIVERSAL QUANTIFICATION IN LARGE DATABASES

Authors
Citation
G. Graefe et Rl. Cole, FAST ALGORITHMS FOR UNIVERSAL QUANTIFICATION IN LARGE DATABASES, ACM transactions on database systems, 20(2), 1995, pp. 187-236
Citations number
33
Categorie Soggetti
Computer Sciences","Computer Science Information Systems","Computer Science Software Graphycs Programming
ISSN journal
03625915
Volume
20
Issue
2
Year of publication
1995
Pages
187 - 236
Database
ISI
SICI code
0362-5915(1995)20:2<187:FAFUQI>2.0.ZU;2-N
Abstract
Universal quantification is not supported directly in most database sy stems despite the fact that it adds significant power to a system's qu ery processing and inference capabilities, in particular for the analy sis of many-to-many relationships and of set-valued attributes. One of the main reasons for this omission has been that universal quantifica tion algorithms and their performance have not been explored for large databases. In this article, we describe and compare three known algor ithms and one recently proposed algorithm for relational division, the algebra operator that embodies universal quantification. For each alg orithm, we investigate the performance effects of explicit duplicate r emoval and referential integrity enforcement, variants for inputs larg er than memory, and parallel execution strategies. Analytical and expe rimental performance comparisons illustrate the substantial difference s among the algorithms. Moreover, comparisons demonstrate that the rec ently proposed division algorithm evaluates a universal quantification predicate over two relations as fast as hash (semi-) join evaluates a n existential quantification predicate over the same relations. Thus, existential and universal quantification can be supported with equal e fficiency by adding the recently proposed algorithm to a query evaluat ion system. A second result of our study is that universal quantificat ion should be expressed directly in a database query language because most query optimizers do not recognize the rather indirect formulation s available in SQL as relational division, and therefore produce very poor evaluation plans for many universal quantification queries.