M. Kandemir et al., A unified framework for optimizing locality, parallelism, and communication in out-of-core computations, IEEE PARALL, 11(7), 2000, pp. 648-668
Citations number
45
Subject Categories
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
This paper presents a unified framework that optimizes out-of-core programs
by exploiting locality and parallelism, and reducing communication overhead.
For out-of-core problems where the data set sizes far exceed the size of
the available in-core memory, it is particularly important to exploit the
memory hierarchy by optimizing the I/O accesses. We present algorithms that
consider both iteration space (loop) and data space (file layout)
transformations in a unified framework. We show that the performance of an
out-of-core loop nest containing references to out-of-core arrays can be
improved by using a suitable combination of file layout choices and loop
restructuring transformations. Our approach considers array references
one-by-one and attempts to optimize each reference for parallelism and
locality. When there are references for which parallelism optimizations do
not work, communication is vectorized so that data transfer can be performed
before the innermost loop. Results from hand-compiles on IBM SP-2 and Intel
Paragon distributed-memory message-passing architectures show that this
approach reduces the execution times and improves the overall speedups. In
addition, we extend the base algorithm to work with file layout constraints
and show how it is useful for optimizing programs that consist of multiple
loop nests.
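
The interaction between file layout and loop order described in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm: it only counts how many contiguous file runs a two-deep loop nest touches when it reads every element of an n x n array stored on disk, assuming that each contiguous run costs one I/O request. The names file_offset and count_io_requests, and the layout and loop-order strings, are hypothetical and chosen for illustration only.

```python
def file_offset(i, j, n, layout):
    """Offset of element (i, j) of an n x n array in a flat file (assumed model)."""
    if layout == "row-major":
        return i * n + j
    if layout == "column-major":
        return j * n + i
    raise ValueError(f"unknown layout: {layout}")


def count_io_requests(n, layout, loop_order):
    """Count contiguous file runs touched by a loop nest that reads A[i][j].

    loop_order "ij" means i is the outer loop and j the inner loop;
    "ji" is the interchanged nest. A new request is counted whenever
    the next offset does not directly follow the previous one.
    """
    if loop_order not in ("ij", "ji"):
        raise ValueError(f"unknown loop order: {loop_order}")
    requests = 0
    prev = None
    for outer in range(n):
        for inner in range(n):
            i, j = (outer, inner) if loop_order == "ij" else (inner, outer)
            off = file_offset(i, j, n, layout)
            if prev is None or off != prev + 1:
                requests += 1
            prev = off
    return requests


if __name__ == "__main__":
    n = 4
    for layout in ("row-major", "column-major"):
        for loop_order in ("ij", "ji"):
            print(layout, loop_order, count_io_requests(n, layout, loop_order))
```

In this model a mismatched pair (for example, a row-major file traversed with the "ji" nest) issues one request per element, while either interchanging the loops or changing the file layout brings the count down to a single contiguous run. This is the kind of choice between a loop transformation and a layout transformation that the abstract refers to.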