Integrating data from heterogeneous data sources is a critical problem that
has received a great deal of attention in recent years. There are two
competing approaches to this problem. The traditional approach, often called
global-as-view, first appeared in Multibase and more recently in HERMES and
TSIMMIS; it defines the global model as a view on the sources. A more recent
approach, sometimes referred to as local-as-view or view rewriting, defines
the sources as views on the global model. The disadvantage of the first
approach is that a person must re-engineer the definitions of the global
model whenever any of the sources change or new sources are added. The view
rewriting approach does not suffer from this drawback, but the problem of
rewriting queries into equivalent plans using views is computationally hard
and must be solved for each query at run-time.
In this paper we propose a hybrid approach that amortizes the cost of query
processing over all queries by pre-compiling the source descriptions into a
minimal set of integration axioms. Using this approach, the sources are
defined in terms of the global model and then compiled into axioms that
define the global model in terms of the sources. These axioms can be
efficiently instantiated at run-time to determine the most appropriate
rewriting for answering a query, and they facilitate traditional cost-based
query optimization. Our approach combines the flexibility of the
local-as-view approach with the run-time efficiency of query processing in
global-as-view systems. We have implemented this approach for the SIMS and
Ariadne information mediators and provide empirical results demonstrating
that, in practice, the approach scales to large numbers of sources and can
compile the axioms for a variety of real-world domains in a matter of
seconds.
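
The compile-once, instantiate-per-query idea can be illustrated with a small
sketch. It is not the compilation algorithm of this paper: it uses a
simplified form of the well-known inverse-rules construction as a stand-in,
makes no attempt to produce a minimal axiom set, and every name in it
(`SOURCE_DESCRIPTIONS`, `compile_axioms`, `candidate_sources`, the flight and
fare relations) is a hypothetical chosen for illustration.

```python
from collections import defaultdict

# Hypothetical local-as-view descriptions (illustrative only):
# source name -> (head variables, body atoms over the global model).
SOURCE_DESCRIPTIONS = {
    "flights_db": (("src", "dst"),
                   [("flight", ("src", "dst", "carrier"))]),
    "fares_db": (("src", "dst", "price"),
                 [("flight", ("src", "dst", "carrier")),
                  ("fare", ("carrier", "price"))]),
}


def compile_axioms(descriptions):
    """Invert each source view once, before any query arrives.

    Assumption: a simplified inverse-rules construction, used here only
    as a stand-in for the paper's compilation step.
    """
    axioms = defaultdict(list)
    for source, (head_vars, body) in descriptions.items():
        for relation, args in body:
            # Variables the view does not export become Skolem-style
            # placeholder terms over the view's head variables.
            terms = tuple(
                a if a in head_vars
                else f"f_{source}_{a}({', '.join(head_vars)})"
                for a in args
            )
            axioms[relation].append((terms, source, head_vars))
    return dict(axioms)


def candidate_sources(axioms, relation):
    """Run-time step: look up which sources can supply a global relation."""
    return [source for _terms, source, _head in axioms.get(relation, [])]


# Compiled once; reused for every query.
AXIOMS = compile_axioms(SOURCE_DESCRIPTIONS)
```

Under these assumptions, a query over the global relation `flight` would
consult `candidate_sources(AXIOMS, "flight")` instead of searching for a
rewriting from scratch, and a cost-based optimizer could then choose among
the sources returned.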