Semantic query optimization for query plans of heterogeneous multidatabasesystems

Citation
Cn. Hsu et Ca. Knoblock, Semantic query optimization for query plans of heterogeneous multidatabasesystems, IEEE KNOWL, 12(6), 2000, pp. 959-978
Citations number
37
Categorie Soggetti
AI Robotics and Automatic Control
Journal title
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
ISSN journal
10414347 → ACNP
Volume
12
Issue
6
Year of publication
2000
Pages
959 - 978
Database
ISI
SICI code
1041-4347(200011/12)12:6<959:SQOFQP>2.0.ZU;2-A
Abstract
New applications of information systems, such as electronic commerce and he althcare information systems, need to integrate a large number of heterogen eous databases over computer networks. Answering a query in these applicati ons usually involves selecting relevant information sources and generating a query plan to combine the data automatically. As significant progress has been made in source selection and plan generation, the critical issue has been shifting to query optimization. This paper presents a semantic query o ptimization (SQO) approach to optimizing query plans of heterogeneous multi database systems. This approach provides global optimization for query plan s as well as local optimization for subqueries that retrieve data from indi vidual database sources. An important feature of our local optimization alg orithm is that we prove necessary and sufficient conditions to eliminate an unnecessary join in a conjunctive query of arbitrary join topology. This f eature allows our optimizer to utilize more expressive relational rules to provide a wider range of possible optimizations than previous work in SQO. The local optimization algorithm also features a new data structure called AND-OR implication graphs to facilitate the search for optimal queries. The se features allow the global optimization to effectively use semantic knowl edge to reduce data transmission cost. We have implemented this approach in to the PESTO query plan optimizer as a part of the SIMS information mediato r. Experimental results demonstrate that PESTO can provide significant savi ngs in query execution cost over query plan execution without optimization.