IMPROVING DATA LOCALITY WITH LOOP TRANSFORMATIONS

Citation
Ks. Mckinley et al., IMPROVING DATA LOCALITY WITH LOOP TRANSFORMATIONS, ACM transactions on programming languages and systems, 18(4), 1996, pp. 424-453
Citations number
31
Categorie Soggetti
Computer Sciences","Computer Science Software Graphycs Programming
ISSN journal
01640925
Volume
18
Issue
4
Year of publication
1996
Pages
424 - 453
Database
ISI
SICI code
0164-0925(1996)18:4<424:IDLWLT>2.0.ZU;2-S
Abstract
In the past decade, processor speed has become significantly faster th an memory speed. Small, fast cache memories are designed to overcome t his discrepancy, but they are only effective when programs exhibit dat a locality. In this article, we present compiler optimizations to impr ove data locality based on a simple yet accurate cost model. The model computes both temporal and spatial reuse of cache lines to find desir able loop organizations. The cost model drives the application of comp ound transformations consisting of loop permutation, loop fusion, loop distribution, and loop reversal. We demonstrate that these program tr ansformations are useful for optimizing many programs. To validate our optimization strategy, we implemented our algorithms and ran experime nts on a large collection of scientific programs and kernels. Experime nts illustrate that for kernels our model and algorithm can select and achieve the best loop structure for a nest. For over 30 complete appl ications, we executed the original and transformed versions and simula ted cache hit rates. We collected statistics about the inherent charac teristics of these programs and our ability to improve their data loca lity. To our knowledge, these studies are the first of such breadth an d depth. We found performance improvements were difficult to achieve b ecause benchmark programs typically have high hit rates even for small data caches; however, our optimizations significantly improved severa l programs.