Loop blocking (tiling) is a well-known compiler optimization that improves
cache performance by dividing the loop iteration space into smaller blocks
(tiles); reuse of array elements within each tile is maximized by ensuring
that the working set of the tile fits in the data cache. Padding is a data
alignment technique that inserts dummy elements into a data structure to
improve cache performance. In this work, we present DAT, a technique that
augments loop tiling with data alignment, achieving improved efficiency (by
ensuring that the cache is never under-utilized) as well as improved
flexibility (by eliminating self-interference cache conflicts regardless of
the tile size). This results in more stable and better cache performance
than existing approaches, in addition to maximizing cache utilization,
eliminating self-interference, and minimizing cross-interference conflicts.
Further, while all previous efforts are targeted at programs characterized
by the reuse of a single array, we also address the issue of minimizing
conflict misses when several tiled arrays are involved. To validate our
technique, we ran extensive experiments using both simulations and actual
measurements on SUN Sparc5 and Sparc10 workstations. The results on
benchmarks exhibiting varying memory access patterns demonstrate the
effectiveness of our technique through consistently high hit ratios and
improved performance across varying problem sizes.
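
To make the interplay between tiling and padding concrete, the following is a minimal illustrative sketch; it is not the DAT algorithm itself. It assumes a direct-mapped cache whose lines hold a single array element, row-major arrays, and a brute-force search for the padded leading dimension. The names tile_conflicts, pad_leading_dim, and tiled_matmul are hypothetical and chosen only for this example.

```python
def tile_conflicts(ld, tile_rows, tile_cols, cache_elems):
    """Count elements of one tile that map to an already occupied slot.

    Assumption: direct-mapped cache with one-element lines, so the slot
    of element (i, j) in a row-major array with leading dimension ld is
    (i * ld + j) % cache_elems.
    """
    seen = set()
    conflicts = 0
    for i in range(tile_rows):
        for j in range(tile_cols):
            slot = (i * ld + j) % cache_elems
            if slot in seen:
                conflicts += 1  # self-interference inside the tile
            else:
                seen.add(slot)
    return conflicts


def pad_leading_dim(n, tile_rows, tile_cols, cache_elems):
    """Smallest leading dimension >= n with a conflict-free tile.

    Brute-force stand-in for an analytical padding computation.
    """
    if tile_rows * tile_cols > cache_elems:
        raise ValueError("tile does not fit in the cache")
    ld = n
    while tile_conflicts(ld, tile_rows, tile_cols, cache_elems) > 0:
        ld += 1  # insert one more dummy element per row
    return ld


def tiled_matmul(a, b, n, ld, tile):
    """C = A * B for n x n matrices stored row-major with leading dimension ld.

    The (kk, jj) tile of B is reused for every row i before moving on.
    """
    c = [0.0] * (n * ld)
    for kk in range(0, n, tile):
        for jj in range(0, n, tile):
            for i in range(n):
                for k in range(kk, min(kk + tile, n)):
                    aik = a[i * ld + k]
                    for j in range(jj, min(jj + tile, n)):
                        c[i * ld + j] += aik * b[k * ld + j]
    return c


if __name__ == "__main__":
    # A 32 x 32 tile exactly fills a 1024-element cache, but with an
    # unpadded row length of 256 most of the tile collides with itself.
    print(tile_conflicts(256, 32, 32, 1024))
    print(pad_leading_dim(256, 32, 32, 1024))
```

Under these assumptions, padding each row with a few dummy elements lets the tile occupy the whole cache without self-interference, so the tile size no longer has to be shrunk to dodge conflicts. A real cache with multi-element lines and set associativity would need a correspondingly refined slot mapping.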