A unified framework for optimizing locality, parallelism, and communication in out-of-core computations

Citation
M. Kandemir et al., A unified framework for optimizing locality, parallelism, and communication in out-of-core computations, IEEE PARALL, 11(7), 2000, pp. 648-668
Number of citations
45
Subject categories
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
Journal ISSN
1045-9219
Volume
11
Issue
7
Year of publication
2000
Pages
648 - 668
Database
ISI
SICI code
1045-9219(200007)11:7<648:AUFFOL>2.0.ZU;2-U
Abstract
This paper presents a unified framework that optimizes out-of-core programs by exploiting locality and parallelism and reducing communication overhead. For out-of-core problems, where the data set sizes far exceed the size of the available in-core memory, it is particularly important to exploit the memory hierarchy by optimizing the I/O accesses. We present algorithms that consider both iteration space (loop) and data space (file layout) transformations in a unified framework. We show that the performance of an out-of-core loop nest containing references to out-of-core arrays can be improved by using a suitable combination of file layout choices and loop restructuring transformations. Our approach considers array references one by one and attempts to optimize each reference for parallelism and locality. When there are references for which parallelism optimizations do not work, communication is vectorized so that data transfer can be performed before the innermost loop. Results from hand-compiled codes on IBM SP-2 and Intel Paragon distributed-memory message-passing architectures show that this approach reduces execution times and improves overall speedups. In addition, we extend the base algorithm to work with file layout constraints and show how it is useful for optimizing programs that consist of multiple loop nests.
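The sketch below is an illustration of the locality issue the abstract refers to, not code from the paper: it shows how making the loop order match the file layout of an out-of-core array lets each I/O call transfer a contiguous block rather than a single element. The file name ooc_array.bin, the size N, and the read_row helper are assumptions made for this example; the paper's framework applies such loop/file-layout transformations inside a compiler targeting distributed-memory machines.

/* Minimal sketch: traversing an out-of-core array stored row-major on disk
 * with a loop order that matches the file layout. */
#include <stdio.h>
#include <stdlib.h>

#define N 512  /* hypothetical size: the array is N x N doubles, row-major on disk */

/* Fetch one contiguous row of the on-disk array into an in-core buffer. */
static int read_row(FILE *f, long i, double *buf)
{
    if (fseek(f, i * (long)(N * sizeof(double)), SEEK_SET) != 0)
        return -1;
    return fread(buf, sizeof(double), N, f) == (size_t)N ? 0 : -1;
}

int main(void)
{
    /* Create a small row-major data file so the sketch is self-contained. */
    FILE *f = fopen("ooc_array.bin", "wb+");
    if (!f) { perror("fopen"); return 1; }

    double *row = malloc(N * sizeof(double));
    if (!row) { fclose(f); return 1; }

    for (long i = 0; i < N; i++) {
        for (long j = 0; j < N; j++)
            row[j] = (double)(i + j);
        fwrite(row, sizeof(double), N, f);
    }

    /* Layout-matched traversal: the row index is the outer loop, so each
     * iteration issues one large contiguous read. A column-order traversal
     * of this row-major file would need a seek per element, which is the
     * kind of I/O behavior the loop/file-layout transformations avoid. */
    double sum = 0.0;
    for (long i = 0; i < N; i++) {
        if (read_row(f, i, row) != 0) { perror("read_row"); break; }
        for (long j = 0; j < N; j++)
            sum += row[j];
    }

    printf("checksum = %f\n", sum);
    free(row);
    fclose(f);
    remove("ooc_array.bin");
    return 0;
}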