Detecting redundant materialized views in data warehouse evolution

Authors
Theodoratos, D.
Citation
D. Theodoratos, Detecting redundant materialized views in data warehouse evolution, INF SYST, 26(5), 2001, pp. 363-381
Citations number
43
Subject Categories
Information Technology & Communication Systems
Journal title
INFORMATION SYSTEMS
ISSN journal
0306-4379
Volume
26
Issue
5
Year of publication
2001
Pages
363 - 381
Database
ISI
SICI code
0306-4379(200107)26:5<363:DRMVID>2.0.ZU;2-U
Abstract
A data warehouse (DW) can be abstractly seen as a set of materialized views defined over a set of remote data sources. A DW is intended to satisfy a set of queries. The views materialized in a DW relate to each other in a complex manner, through common subexpressions, in order to guarantee high query performance and low view maintenance cost. DWs are time varying. As time passes new materialized views are added in order to satisfy new queries, or for performance reasons, while old queries are dropped. The evolution of a DW can result in a redundant set of materialized views. In this paper, we address the problem of detecting redundant materialized views in a given DW view selection, that is, materialized views that can be removed from DW without negatively affecting the query evaluation or the view maintenance process. Using an AND/OR dag representation for multiple queries and views, we first formalize the process of propagating source relation changes to the materialized views by exploiting common subexpressions between views and by using other materialized views that are not affected by these changes. Then, we provide an algorithm for detecting materialized views that are not needed in the process of propagating source relation changes to the DW. We also show how trivially redundant views can be identified in this process. Finally, we use these results to provide a procedure for detecting materialized views that are redundant in a DW. Our approach considers a broad class of views that includes grouping/aggregation views and is not dependent on a specific cost model. (C) 2001 Elsevier Science Ltd. All rights reserved.
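As a rough illustration of the AND/OR dag idea mentioned in the abstract, the following minimal sketch models each node as having alternative derivations (OR), each of which requires all of its inputs (AND). It is not the paper's algorithm: it checks only whether every query stays computable after one view is dropped, and it ignores view maintenance and cost entirely. The names `Dag`, `derivable`, `removable_views`, and the example views `V1` to `V3` and queries `Q1`, `Q2` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy AND/OR dag: each node maps to a list of alternative
# derivations (OR); each derivation is a set of input nodes (AND).
Dag = Dict[str, List[FrozenSet[str]]]


def derivable(node: str, dag: Dag, available: Set[str]) -> bool:
    """Return True if `node` can be computed from the `available` nodes alone."""
    if node in available:
        return True
    # OR over the alternatives, AND over the inputs of each alternative.
    return any(
        all(derivable(child, dag, available) for child in alternative)
        for alternative in dag.get(node, [])
    )


def removable_views(dag: Dag, materialized: Set[str], queries: Set[str]) -> Set[str]:
    """Views whose individual removal still leaves every query derivable."""
    return {
        view
        for view in materialized
        if all(derivable(q, dag, materialized - {view}) for q in queries)
    }


# Assumed example: Q1 can be answered from V1, or from V2 and V3 together;
# Q2 needs V3; V1 itself can be recomputed from V2 and V3.
dag: Dag = {
    "Q1": [frozenset({"V1"}), frozenset({"V2", "V3"})],
    "Q2": [frozenset({"V3"})],
    "V1": [frozenset({"V2", "V3"})],
}
materialized = {"V1", "V2", "V3"}
queries = {"Q1", "Q2"}
candidates = removable_views(dag, materialized, queries)
```

In this toy setting `V1` and `V2` are each removable on their own, but not both at once, since `Q1` would then have no usable derivation. This hints at why redundancy has to be judged against the whole view selection rather than one view at a time.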