COMPILING ARRAY EXPRESSIONS FOR EFFICIENT EXECUTION ON DISTRIBUTED-MEMORY MACHINES

Citation
S.K.S. Gupta et al., COMPILING ARRAY EXPRESSIONS FOR EFFICIENT EXECUTION ON DISTRIBUTED-MEMORY MACHINES, Journal of Parallel and Distributed Computing, 32(2), 1996, pp. 155-172
Number of citations
21
Subject Categories
Computer Sciences; Computer Science Theory & Methods
Journal ISSN
0743-7315
Volume
32
Issue
2
Year of publication
1996
Pages
155 - 172
Database
ISI
SICI code
0743-7315(1996)32:2<155:CAEFEE>2.0.ZU;2-C
Abstract
Array statements are often used to express data-parallelism in scientific languages such as Fortran 90 and High Performance Fortran. In compiling array statements for a distributed-memory machine, efficient generation of communication sets and local index sets is important. We show that for arrays distributed block-cyclically on multiple processors, the local memory access sequence and communication sets can be efficiently enumerated as closed forms using regular sections. First, closed form solutions are presented for arrays that are distributed using block or cyclic distributions. These closed forms are then used with a virtual processor approach to give an efficient solution for arrays with block-cyclic distributions. This approach is based on viewing a block-cyclic distribution as a block (or cyclic) distribution on a set of virtual processors, which are cyclically (or block-wise) mapped to physical processors. These views are referred to as virtual-block or virtual-cyclic views, depending on whether a block or cyclic distribution of the array on the virtual processors is used. The virtual processor approach permits different schemes based on the combination of the virtual processor views chosen for the different arrays involved in an array statement. These virtualization schemes have different indexing overhead. We present a strategy for identifying the virtualization scheme which will have the best performance. Performance results on a Cray T3D system are presented for hand-compiled code for array assignments. These results show that using the virtual processor approach, efficient code can be generated for execution of array statements involving block-cyclically distributed arrays. (C) 1996 Academic Press, Inc.
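
Illustrative note (not part of the original record): the two virtual processor views described in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the paper's algorithm: it assumes a one-dimensional array of n elements with 0-based global indices, block size k, and P physical processors, and every function name below is hypothetical. It shows only that both views assign each element to the same physical processor and that the elements owned by one processor can be listed as regular sections (lo, hi, stride) in either view.

```python
# Sketch only: 0-based indices, 1-D array, block-cyclic distribution with
# block size k over P physical processors. All names are hypothetical.

def owner_direct(i, k, P):
    """Physical processor owning global index i under block-cyclic(k)."""
    return (i // k) % P

def owner_virtual_block(i, k, P):
    """Virtual-block view: each block of k elements is one virtual
    processor; virtual processors are mapped cyclically to physical ones."""
    v = i // k
    return v % P

def owner_virtual_cyclic(i, k, P):
    """Virtual-cyclic view: the array is dealt cyclically over P*k virtual
    processors; virtual processors are mapped block-wise to physical ones."""
    v = i % (P * k)
    return v // k

def sections_virtual_block(p, n, k, P):
    """Regular sections (lo, hi, stride) owned by physical processor p,
    one stride-1 section per virtual processor mapped to p."""
    num_vps = -(-n // k)  # ceiling division: number of blocks
    return [(v * k, min((v + 1) * k, n) - 1, 1) for v in range(p, num_vps, P)]

def sections_virtual_cyclic(p, n, k, P):
    """Regular sections (lo, hi, stride) owned by physical processor p,
    one stride-(P*k) section per virtual processor mapped to p."""
    return [(v, n - 1, P * k) for v in range(p * k, (p + 1) * k) if v < n]

def expand(sections):
    """Enumerate the global indices covered by a list of regular sections."""
    return sorted(i for lo, hi, s in sections for i in range(lo, hi + 1, s))

if __name__ == "__main__":
    n, k, P = 20, 3, 2
    for p in range(P):
        print(p, sections_virtual_block(p, n, k, P),
              sections_virtual_cyclic(p, n, k, P))
```

The virtual-block view produces many short stride-1 sections, while the virtual-cyclic view produces at most k long sections with stride P*k. The two views therefore enumerate the same elements with different loop structures, which is consistent with the abstract's remark that the virtualization schemes differ in indexing overhead.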