It is often too expensive to compute and materialize a complete
high-dimensional data cube. Computing an iceberg cube, which contains only
aggregates above certain thresholds, is an effective way to derive
nontrivial multidimensional aggregations for OLAP and data mining.
In this paper, we study efficient methods for computing iceberg cubes with
some commonly used complex measures, such as average, and develop a
methodology that adopts a weaker but anti-monotonic condition for testing
and pruning the search space. In particular, for efficient computation of
iceberg cubes with the average measure, we propose a top-k average pruning
method and extend two previously studied methods, Apriori and BUC, to
Top-k Apriori and Top-k BUC. To further improve performance, a hypertree
structure, called H-tree, is designed and a new iceberg cubing method,
called Top-k H-Cubing, is developed. Our performance study shows that
Top-k BUC and Top-k H-Cubing are two promising candidates for scalable
computation, and that Top-k H-Cubing performs better in most cases.
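
To illustrate why a weaker condition can be anti-monotonic when the average
itself is not, here is a minimal sketch of a top-k average test. It assumes
an iceberg condition of the form "average >= v with at least k tuples",
which the abstract does not state explicitly. The names top_k_average,
can_prune, and best_subset_average are hypothetical, chosen for this
illustration only. If the average of the k largest values in a cell is
already below v, then no sub-cell with at least k tuples can reach v, so
the cell and its descendants can be skipped.

```python
from itertools import combinations


def top_k_average(values, k):
    """Average of the k largest values, or None when fewer than k values exist."""
    if len(values) < k:
        return None
    return sum(sorted(values, reverse=True)[:k]) / k


def can_prune(values, k, v):
    """True when no sub-cell with at least k tuples can reach average v.

    Assumed iceberg condition (not taken from the abstract):
    avg(values) >= v and count(values) >= k.
    """
    bound = top_k_average(values, k)
    return bound is None or bound < v


def best_subset_average(values, k):
    """Brute-force check: the best average over all subsets of size >= k.

    Expects len(values) >= k; used only to confirm the bound on small inputs.
    """
    return max(
        sum(subset) / len(subset)
        for size in range(k, len(values) + 1)
        for subset in combinations(values, size)
    )


# Made-up measure values for one cell.
prices = [10, 20, 30, 40]
```

On this made-up cell, the top-2 average equals the best average that any
subset of two or more tuples can achieve, so a threshold above that value
lets the whole cell be pruned without examining its descendants.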