Static and dynamic locality optimizations using integer linear programming

Citation
M. Kandemir et al., Static and dynamic locality optimizations using integer linear programming, IEEE PARALL, 12(9), 2001, pp. 922-941
Citations number
61
Categorie Soggetti
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
ISSN journal
10459219 → ACNP
Volume
12
Issue
9
Year of publication
2001
Pages
922 - 941
Database
ISI
SICI code
1045-9219(200109)12:9<922:SADLOU>2.0.ZU;2-Q
Abstract
The delivered performance on modern processors that employ deep memory hier archies is closely related to the performance of the memory subsystem. Comp iler optimizations aimed at improving cache locality are critical in realiz ing the performance potential of powerful processors. For scientific applic ations, several loop transformations have been shown to be useful in improv ing both temporal and spatial locality. Recently, there has been some work in the area of data layout optimizations, i.e., changing the memory layouts of multidimensional arrays from the language-defined default such as colum n-major storage in Fortran. The effect of such memory layout decisions is o n the spatial locality characteristics of loop nests. While data layout tra nsformations are not constrained by data dependences, they have no effect o n temporal locality. On the other hand, loop transformations are not readil y applicable to imperfect loop nests and are constrained by data dependence s. More importantly, loop transformations affect the memory access patterns of all the arrays accessed in a loop nest and, as a result, the locality c haracteristics of some of the arrays may worsen. This paper presents a tech nique based on integer linear programming (ILP) that attempts to derive the best combination of loop and data layout transformations. Prior attempts t o unify loop and data layout transformations for programs consisting of a s equence of loop nests have been based on heuristics not only for transforma tions for a single loop nest but also for the sequence in which loop nests will be considered. The ILP formulation presented here obviates the need fo r such heuristics and gives us a bar against which the heuristic algorithms can be compared, More importantly, our approach is able to transform memor y layouts dynamically during program execution. This is particularly useful in applications whose disjoint code segments demand different layouts for a given array. In addition, we show how this formulation can be extended to address the false sharing problem in a multiprocessor environment. The key data structure we introduce is the memory layout graph (MLG) that allows u s to formulate the problems as path problems. The paper discusses the relat ionship of this ILP approach based on the memory layout graphs to other wor k in the area including our previous work. Experimental results on a MIPS R 10000-based system demonstrate the benefits of this approach and show that the use of the ILP formulation does not increase the compilation time signi ficantly.