Applicable to arbitrary sequences and nests of loops, affine partitioning i
s a program transformation framework that unifies many previously proposed
loop transformations, including unimodular transforms, fusion, fission, rei
ndexing, scaling and statement reordering. Algorithms based on affine parti
tioning have been shown to be effective for parallelization and communicati
on minimization. This paper presents algorithms that improve data locality
using affine partitioning.
Blocking and array contraction are two important optimizations that have be
en shown to be useful for data locality. Blocking creates a set of inner lo
ops so that data brought into the faster levels of the memory hierarchy can
be reused. Array contraction reduces an array to a scalar variable and the
reby reduces the number of memory operations executed and the memory footpr
int. Loop transforms axe often necessary to make blocking and array contrac
tion possible.
By bringing the full generality of affine partitioning to bear on the probl
em, our locality algorithm can find more contractable arrays than previousl
y possible. This paper also generalizes the concept of blocking and shows t
hat affine partitioning allows the benefits of blocking be realized in arbi
trarily nested loops. Experimental results on a number of benchmarks and a
complete multigrid application in aeronautics indicates that affine partiti
oning is effective in practice.