Compiling source descriptions for efficient and flexible information integration

Citation
J.L. Ambite et al., Compiling source descriptions for efficient and flexible information integration, Journal of Intelligent Information Systems, 16(2), 2001, pp. 149-187
Number of citations
40
Subject categories
Information Technology & Communication Systems
Journal title
JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
ISSN journal
0925-9902
Volume
16
Issue
2
Year of publication
2001
Pages
149 - 187
Database
ISI
SICI code
0925-9902(2001)16:2<149:CSDFEA>2.0.ZU;2-6
Abstract
Integrating data from heterogeneous data sources is a critical problem that has received a great deal of attention in recent years. There are two competing approaches to address this problem. The traditional approach, which first appeared in Multibase and more recently in HERMES and TSIMMIS, often called global-as-view, defines the global model as a view on the sources. A more recent approach, sometimes referred to as local-as-view or view rewriting, defines the sources as views on the global model. The disadvantage of the first approach is that a person must re-engineer the definitions of the global model whenever any of the sources change or when new sources are added. The view rewriting approach does not suffer from this drawback, but the problem of rewriting queries into equivalent plans using views is computationally hard and must be performed for each query at run-time. In this paper we propose a hybrid approach that amortizes the cost of query processing over all queries by pre-compiling the source descriptions into a minimal set of integration axioms. Using this approach, the sources are defined in terms of the global model and then compiled into axioms that define the global model in terms of the sources. These axioms can be efficiently instantiated at run-time to determine the most appropriate rewriting to answer a query and facilitate traditional cost-based query optimization. Our approach combines the flexibility of the local-as-view approach with the run-time efficiency of the query processing in global-as-view systems. We have implemented this approach for the SIMS and Ariadne information mediators and provide empirical results that demonstrate that in practice the approach scales to large numbers of sources and that the approach can compile the axioms for a variety of real-world domains in a matter of seconds.
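
The idea of turning local-as-view source descriptions into rules that define the global model in terms of the sources can be illustrated with a small sketch. The code below is not the compilation algorithm of the paper, which the abstract does not spell out; it uses the simpler, well-known inverse-rules construction as a stand-in. The relation names (flight, src_flights, src_hub), the Atom type, and the function names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    """A predicate applied to a tuple of variable names (hypothetical type)."""
    pred: str
    args: tuple


def invert(source, body):
    """Invert one local-as-view description  source(head) :- body.

    Returns one rule (global_atom, source) per body atom, read as
    'global_atom holds whenever source holds'. Variables of the body that
    are not exported by the source head are replaced by Skolem terms.
    """
    head_vars = set(source.args)
    rules = []
    for atom in body:
        new_args = tuple(
            a if a in head_vars
            else f"f_{source.pred}_{a}({', '.join(source.args)})"
            for a in atom.args
        )
        rules.append((Atom(atom.pred, new_args), source))
    return rules


def compile_descriptions(descriptions):
    """Group the inverted rules by global relation, once, ahead of any query."""
    axioms = {}
    for source, body in descriptions:
        for head, src in invert(source, body):
            axioms.setdefault(head.pred, []).append((head, src))
    return axioms


# Hypothetical source descriptions: each source is a view on the global model.
descriptions = [
    # src_flights(X, Y) :- flight(X, Y)
    (Atom("src_flights", ("X", "Y")), [Atom("flight", ("X", "Y"))]),
    # src_hub(X, Z) :- flight(X, Y), flight(Y, Z)
    (Atom("src_hub", ("X", "Z")),
     [Atom("flight", ("X", "Y")), Atom("flight", ("Y", "Z"))]),
]

axioms = compile_descriptions(descriptions)

for head, src in axioms["flight"]:
    print(f"{head.pred}({', '.join(head.args)}) :- "
          f"{src.pred}({', '.join(src.args)})")
```

Adding a source in this sketch only means appending one description and recompiling; the rules for the global relation are then looked up at query time rather than derived per query, which is the trade-off the abstract describes. Minimizing the rule set and cost-based plan selection are not modeled here.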