A data warehouse (DW) can be abstractly seen as a set of materialized views
defined over a set of remote data sources. A DW is intended to satisfy a
set of queries. The views materialized in a DW relate to each other in a
complex manner, through common subexpressions, in order to guarantee high
query performance and low view maintenance cost. DWs are time varying. As time
passes new materialized views are added in order to satisfy new queries, or
for performance reasons, while old queries are dropped. The evolution of a
DW can result in a redundant set of materialized views. In this paper, we
address the problem of detecting redundant materialized views in a given DW
view selection, that is, materialized views that can be removed from the DW
without negatively affecting the query evaluation or the view maintenance
process. Using an AND/OR dag representation for multiple queries and views, we
first formalize the process of propagating source relation changes to the
materialized views by exploiting common subexpressions between views and by
using other materialized views that are not affected by these changes.
Then, we provide an algorithm for detecting materialized views that are not
needed in the process of propagating source relation changes to the DW. We
also show how trivially redundant views can be identified in this process.
Finally, we use these results to provide a procedure for detecting
materialized views that are redundant in a DW. Our approach considers a broad
class of views that includes grouping/aggregation views and is not dependent on a
specific cost model. (C) 2001 Elsevier Science Ltd. All rights reserved.
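
As a rough illustration of the setting described above, the following sketch encodes a small AND/OR dag and flags materialized views whose removal leaves every query answerable from the remaining views. It is a deliberate simplification and not the procedure of the paper: it checks only query answerability and ignores change propagation, view maintenance, and any cost model. The dag encoding, the node names (R, S, T, RS, ST, RST), and the functions `computable` and `removable_views` are all hypothetical and chosen for this example only.

```python
from typing import Dict, List, Set

# Hypothetical AND/OR dag encoding (an assumption of this sketch):
# each expression node (OR node) maps to its alternative derivations, and
# each alternative (AND node) is the list of child nodes it combines.
# Source relations R, S, T have no entry because they are remote.
Dag = Dict[str, List[List[str]]]

dag: Dag = {
    "RS": [["R", "S"]],                 # R join S
    "ST": [["S", "T"]],                 # S join T
    "RST": [["RS", "T"], ["R", "ST"]],  # two ways to compute R join S join T
}

def computable(node: str, dag: Dag, materialized: Set[str]) -> bool:
    """True if node can be derived from materialized views alone,
    that is, without reading any remote source relation."""
    if node in materialized:
        return True
    # An OR node is computable if at least one alternative has all of
    # its children computable; nodes without alternatives are not.
    return any(
        all(computable(child, dag, materialized) for child in alternative)
        for alternative in dag.get(node, [])
    )

def removable_views(dag: Dag, materialized: Set[str], queries: Set[str]) -> Set[str]:
    """Views that can be dropped one at a time while every query stays
    answerable from the remaining materialized views (simplified test)."""
    removable = set()
    for view in materialized:
        remaining = materialized - {view}
        if all(computable(q, dag, remaining) for q in queries):
            removable.add(view)
    return removable

# Hypothetical DW state: a local copy of T, plus the views RS and RST.
materialized = {"RS", "T", "RST"}
queries = {"RS", "RST"}
candidates = removable_views(dag, materialized, queries)
```

In this toy instance each flagged view is removable only on its own: dropping both T and RST together would leave the query RST unanswerable. A realistic test would also have to account for how each view participates in propagating source relation changes.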