M.C. Rinard, LOCALITY OPTIMIZATIONS FOR PARALLEL COMPUTING USING DATA ACCESS INFORMATION, International journal of high speed computing, 9(2), 1997, pp. 161-179
Given the large communication overheads characteristic of modern
parallel machines, optimizations that improve locality by executing
tasks close to the data that they will access may improve the
performance of parallel computations. This paper describes our
experience automatically applying locality optimizations in the context
of Jade, a portable, implicitly parallel programming language designed
for exploiting task-level concurrency. Jade programmers start with a
program written in a standard serial, imperative language, then use
Jade constructs to declare how parts of the program access data. The
Jade implementation uses this data access information to automatically
extract the concurrency and apply locality optimizations. We present
performance results for several Jade applications running on the
Stanford DASH machine. We use these results to characterize the overall
performance impact of the locality optimizations. In our application
set, the locality optimization level has little effect on the
performance of two of the applications and a large effect on the
performance of the rest of the applications. We also found that, if the
locality optimization level had a significant effect on the
performance, the maximum performance was obtained when the programmer
explicitly placed tasks on processors rather than relying on the
scheduling algorithm inside the Jade implementation.
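
The general idea of using declared data accesses both to find concurrency and to choose where a task runs can be illustrated with a minimal sketch. This is not Jade syntax and not the scheduling algorithm of the Jade implementation; the names Task, conflicts, place, and the owner map are hypothetical, and the "most accessed objects" placement rule is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """A task plus the data objects it declares it will read and write."""
    name: str
    reads: set = field(default_factory=set)
    writes: set = field(default_factory=set)


def conflicts(a, b):
    """Two tasks must stay ordered if one writes an object the other accesses."""
    return bool(a.writes & (b.reads | b.writes) or b.writes & a.reads)


def place(task, owner, default=0):
    """Pick the processor that holds the most objects the task accesses.

    `owner` maps object name -> processor id (assumed known in advance).
    Ties go to the lowest processor id; tasks touching no owned object
    fall back to `default`.
    """
    votes = {}
    for obj in task.reads | task.writes:
        if obj in owner:
            votes[owner[obj]] = votes.get(owner[obj], 0) + 1
    if not votes:
        return default
    return max(sorted(votes), key=lambda p: votes[p])


# Hypothetical data layout: object "x" lives on processor 0, "y" and "z" on 1.
owner = {"x": 0, "y": 1, "z": 1}
t1 = Task("t1", reads={"x"}, writes={"y", "z"})
t2 = Task("t2", reads={"x"})
t3 = Task("t3", reads={"y"})
```

Under these assumptions, t1 and t2 only share a read of "x" and could run concurrently, while t3 reads an object that t1 writes and so must stay ordered with respect to it. The placement rule would send t1 to processor 1 and t2 to processor 0. Explicit placement by the programmer, as discussed in the abstract, would correspond to bypassing place and naming the processor directly.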