The state of the art in distributed query processing

Authors

Kossmann, D

Citation

D. Kossmann, The state of the art in distributed query processing, ACM Computing Surveys, 32(4), 2000, pp. 422-469

Number of citations

162

Subject Categories

Computer Science & Engineering

Journal title

ACM COMPUTING SURVEYS

Journal ISSN

0360-0300

Volume

32

Issue

4

Year of publication

2000

Pages

422 - 469

Database

ISI

SICI code

0360-0300(200012)32:4<422:TSOTAI>2.0.ZU;2-J

Abstract

Distributed data processing is becoming a reality. Businesses want to do it for many reasons, and they often must do it in order to stay competitive. While much of the infrastructure for distributed data processing is already there (e.g., modern network technology), a number of issues make distributed data processing still a complex undertaking: (1) distributed systems can become very large, involving thousands of heterogeneous sites including PCs and mainframe server machines; (2) the state of a distributed system changes rapidly because the load of sites varies over time and new sites are added to the system; (3) legacy systems need to be integrated; such legacy systems usually have not been designed for distributed data processing and now need to interact with other (modern) systems in a distributed environment. This paper presents the state of the art of query processing for distributed database and information systems. The paper presents the "textbook" architecture for distributed query processing and a series of techniques that are particularly useful for distributed database systems. These techniques include special join techniques, techniques to exploit intraquery parallelism, techniques to reduce communication costs, and techniques to exploit caching and replication of data. Furthermore, the paper discusses different kinds of distributed systems such as client-server, middleware (multitier), and heterogeneous database systems, and shows how query processing works in these systems.