A QUERY-PROCESSING ALGORITHM FOR A SYSTEM OF HETEROGENEOUS DISTRIBUTED DATABASES

Citation
Cj. Egyhazy et al., A QUERY-PROCESSING ALGORITHM FOR A SYSTEM OF HETEROGENEOUS DISTRIBUTED DATABASES, DISTRIBUTED AND PARALLEL DATABASES, 4(1), 1996, pp. 49-79
Number of citations
31
Subject Categories
Computer Sciences, Special Topics; Computer Science Theory & Methods; Computer Science Information Systems
Journal ISSN
0926-8782
Volume
4
Issue
1
Year of publication
1996
Pages
49 - 79
Database
ISI
SICI code
0926-8782(1996)4:1<49:AQAFAS>2.0.ZU;2-#
Abstract
This paper presents a query processing algorithm, formulated and developed in support of the prototype architecture of the Distributed Access View Integrated Database (DAVID), which is a heterogeneous distributed database management system. The objective of the proposed query processing algorithm is to produce an inexpensive strategy for a given query. The inexpensive query strategy is obtained primarily by computing the most profitable semi-joins and by determining the best sequence of join operations per processing site. The latter is obtained by applying a zero-one integer linear program that uses a non-parametric statistical estimation technique to compute the sizes of the temporary clusters. A cluster is a subset of the Cartesian product of a list of atomic and non-atomic domains and is the structure that can represent in a uniform way data stored in relational, hierarchical and network databases. Following some background information on the development of the DAVID prototype, this paper introduces the schema architecture. The schema architecture describes the mechanism by which the component heterogeneous database schemata are mapped into the uniform global schema. This is followed by the formulation of the query processing algorithm, its implementation and an illustration of its use in the context of NASA's Astrophysics Data System.
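Illustrative note
The abstract does not give the cost model used to rank semi-joins, so the following is only a minimal sketch of the general textbook idea of a "profitable" semi-join, not the DAVID algorithm itself. It assumes cost is measured in items shipped between sites; the function names, the row-count cost measure and the sample data are all hypothetical.

```python
def semi_join(r_rows, s_rows, attr):
    """Return the rows of r_rows whose attr value also appears in s_rows."""
    keys = {row[attr] for row in s_rows}
    return [row for row in r_rows if row[attr] in keys]


def semi_join_profit(r_rows, s_rows, attr):
    """Assumed cost model: benefit minus cost, both counted in items shipped.

    cost    = distinct join values of S sent to R's site
    benefit = rows of R that no longer need to be shipped afterwards
    """
    cost = len({row[attr] for row in s_rows})
    benefit = len(r_rows) - len(semi_join(r_rows, s_rows, attr))
    return benefit - cost


# Hypothetical data held at two different sites.
observations = [
    {"obs_id": 1, "target": "M31"},
    {"obs_id": 2, "target": "M42"},
    {"obs_id": 3, "target": "M87"},
    {"obs_id": 4, "target": "M13"},
    {"obs_id": 5, "target": "M31"},
]
catalog = [{"target": "M31", "kind": "galaxy"}]

reduced = semi_join(observations, catalog, "target")
profit = semi_join_profit(observations, catalog, "target")
```

Under this assumed model a semi-join is worth performing when its profit is positive, that is, when it removes more rows from the later transfer than it costs to ship the join values. A real optimizer would weigh bytes and link costs rather than row counts.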