Cache investment: Integrating query optimization and distributed data placement

Citation
D. Kossmann et al., Cache investment: Integrating query optimization and distributed data placement, ACM T DATAB, 25(4), 2000, pp. 517-558
Citations number
46
Categorie Soggetti
Computer Science & Engineering
Journal title
ACM TRANSACTIONS ON DATABASE SYSTEMS
ISSN journal
03625915 → ACNP
Volume
25
Issue
4
Year of publication
2000
Pages
517 - 558
Database
ISI
SICI code
0362-5915(200012)25:4<517:CIIQOA>2.0.ZU;2-G
Abstract
Emerging distributed query-processing systems support flexible execution st rategies in which each query can be run using a combination of data shippin g and query shipping. As in any distributed environment, these systems can obtain tremendous performance and availability benefits by employing dynami c data caching. When flexible execution and dynamic caching are combined, h owever, a circular dependency arises: Caching occurs as a by-product of que ry operator placement, but query operator placement decisions are based on (cached) data location. The practical impact of this dependency is that que ry optimization decisions that appear valid on a per-query basis can actual ly cause suboptimal performance for all queries in the long run. To address this problem, we developed Cache Investment-a novel approach for integrating query optimization and data placement that looks beyond the pe rformance of a single query. Cache Investment sometimes intentionally gener ates a "suboptimal" plan for a particular query in the interest of effectin g a better data placement for subsequent queries. Cache Investment can be i ntegrated into a distributed database system without changing the internals of the query optimizer. In this paper, we propose Cache Investment mechani sms and policies and analyze their performance. The analysis uses results f rom both an implementation on the SHORE storage manager and a detailed simu lation model. Our results show that Cache Investment can significantly impr ove the overall performance of a system and demonstrate the trade-offs amon g various alternative policies.