Improving cache locality by a combination of loop and data transformations

Citation
M. Kandemir et al., Improving cache locality by a combination of loop and data transformations, IEEE COMPUT, 48(2), 1999, pp. 159-167
Citations number
21
Categorie Soggetti
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON COMPUTERS
ISSN journal
00189340 → ACNP
Volume
48
Issue
2
Year of publication
1999
Pages
159 - 167
Database
ISI
SICI code
0018-9340(199902)48:2<159:ICLBAC>2.0.ZU;2-N
Abstract
Exploiting locality of reference is key to realizing high levels of perform ance on modern processors. This paper describes a compiler algorithm for op timizing cache locality in scientific codes on uniprocessor and multiproces sor machines. A distinctive characteristic of our algorithm is that it cons iders loop and data layout transformations in a unified framework. Our appr oach is very effective at reducing cache misses and can optimize some nests for which optimization techniques based on loop transformations alone are not successful. An important special case is one in which data layouts of s ome arrays are fixed and cannot be changed. We show how our algorithm can a ccommodate this case and demonstrate how it can be used to optimize multipl e loop nests. Experiments on several benchmarks show that the techniques pr esented in this paper result in substantial improvement in cache performanc e.