M. Rodriguez-Martinez and N. Roussopoulos, MOCHA: A self-extensible database middleware system for distributed data sources, SIGMOD Record, 29(2), 2000, pp. 213-224
We present MOCHA, a new self-extensible database middleware system designed
to interconnect distributed data sources. MOCHA is designed to scale to
large environments and is based on the idea that some of the user-defined
functionality in the system should be deployed by the middleware system
itself. This is realized by shipping Java code that implements either
advanced data types or tailored query operators to remote data sources,
where it is executed. Optimized query plans push the evaluation of powerful
data-reducing operators to the data source sites while executing
data-inflating operators near the client's site. The Volume Reduction
Factor, a new and more explicit metric introduced in this paper, is used to
select the best site at which to execute query operators and is shown to be
more accurate than the standard selectivity factor alone. MOCHA has been
implemented in Java and runs on top of Informix and Oracle. We present the
architecture of MOCHA, the ideas behind it, and a performance study using
scientific data and queries. The results of this study demonstrate that
MOCHA provides a more flexible, scalable, and efficient framework for
distributed query processing than existing middleware solutions.
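
The placement rule described in the abstract can be illustrated with a
minimal sketch. The abstract does not define the Volume Reduction Factor,
so the sketch assumes one plausible reading: the ratio of the data volume
an operator emits to the data volume it reads. The names Operator,
volume_ratio, and choose_site, and all byte counts, are hypothetical and
are not taken from MOCHA.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    """A query operator with hypothetical data-volume estimates in bytes."""
    name: str
    input_bytes: int   # estimated volume the operator reads
    output_bytes: int  # estimated volume the operator emits


def volume_ratio(op: Operator) -> float:
    """Output volume divided by input volume (an assumed reading of the metric)."""
    return op.output_bytes / op.input_bytes


def choose_site(op: Operator) -> str:
    """Run data-reducing operators at the data source, the rest near the client."""
    return "data_source" if volume_ratio(op) < 1.0 else "client"


# Illustrative operators; the figures are made up for the example.
ops = [
    # Keeps every row but shrinks each one, so a row-count selectivity of 1.0
    # would not reveal that it reduces the volume sent over the network.
    Operator("avg_energy", input_bytes=1_000_000, output_bytes=8_000),
    # Produces more data than it reads, so it is data-inflating.
    Operator("upsample_image", input_bytes=1_000_000, output_bytes=4_000_000),
]

placement = {op.name: choose_site(op) for op in ops}
print(placement)
```

Under these assumptions the aggregate is placed at the data source and the
inflating operator near the client, which matches the policy the abstract
describes. A real optimizer would also weigh network and CPU costs.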