Cj. Egyhazy et al., A QUERY-PROCESSING ALGORITHM FOR A SYSTEM OF HETEROGENEOUS DISTRIBUTED DATABASES, DISTRIBUTED AND PARALLEL DATABASES, 4(1), 1996, pp. 49-79
Number of citations
31
Subject Categories
Computer Sciences, Special Topics; Computer Science Theory & Methods; Computer Science Information Systems
This paper presents a query processing algorithm, formulated and developed
in support of the prototype architecture of the Distributed Access View
Integrated Database (DAVID), which is a heterogeneous distributed database
management system. The objective of the proposed query processing algorithm
is to produce an inexpensive strategy for a given query. The inexpensive
query strategy is obtained primarily by computing the most profitable
semi-joins and by determining the best sequence of join operations per
processing site. The latter is obtained by applying a zero-one integer
linear program that uses a non-parametric statistical estimation technique
to compute the sizes of the temporary clusters. A cluster is a subset of the
Cartesian product of a list of atomic and non-atomic domains and is the
structure that can represent in a uniform way data stored in relational,
hierarchical and network databases. Following some background information on
the development of the DAVID prototype, this paper introduces the schema
architecture. The schema architecture describes the mechanism by which the
component heterogeneous database schemata are mapped into the uniform global
schema. This is followed by the formulation of the query processing
algorithm, its implementation and an illustration of its use in the context
of NASA's Astrophysics Data System.
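
The notion of a "profitable" semi-join mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the DAVID algorithm: the abstract gives no cost model, so the functions `semi_join` and `semi_join_profit`, the parameters `row_cost` and `key_cost`, and the profit formula (transfer cost saved minus cost of shipping the join keys) are all assumptions drawn from the textbook treatment of semi-joins. The sketch also counts rows exactly, whereas the paper estimates sizes statistically.

```python
from typing import Dict, Hashable, List

Row = Dict[str, Hashable]


def semi_join(r: List[Row], s: List[Row], attr: str) -> List[Row]:
    """Return the rows of r whose value for attr also appears in s."""
    keys = {row[attr] for row in s}
    return [row for row in r if row[attr] in keys]


def semi_join_profit(
    r: List[Row],
    s: List[Row],
    attr: str,
    row_cost: float = 1.0,   # assumed cost of shipping one full row of r
    key_cost: float = 0.25,  # assumed cost of shipping one join-key value
) -> float:
    """Hypothetical profit of reducing r by s before shipping r elsewhere.

    cost    = shipping the distinct join keys of s to the site holding r
    benefit = rows of r that no longer need to be shipped
    """
    distinct_keys = {row[attr] for row in s}
    cost = key_cost * len(distinct_keys)
    benefit = row_cost * (len(r) - len(semi_join(r, s, attr)))
    return benefit - cost


if __name__ == "__main__":
    r = [{"id": i, "name": f"obj{i}"} for i in range(1, 6)]
    s = [{"id": 2}, {"id": 4}, {"id": 4}]
    print(semi_join(r, s, "id"))
    print(semi_join_profit(r, s, "id"))
```

Under these assumptions a semi-join is worth performing only when the returned profit is positive, that is, when the rows it eliminates cost more to ship than the join keys needed to eliminate them.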