Bh. Lim et al., Performance Implications of Communication Mechanisms in All-Software Global Address Space Systems, ACM SIGPLAN Notices, 32(7), 1997, pp. 230-239
Global addressing of shared data simplifies parallel programming and
complements message passing models commonly found in distributed memory
machines. A number of programming systems have been designed that
synthesize global addressing purely in software on such machines. These
systems provide a number of communication mechanisms to mitigate the
effect of high communication latencies and overheads. This study
compares the mechanisms in two representative all-software systems: CRL
and Split-C. CRL uses region-based caching while Split-C uses
split-phase and push-based data transfers for optimizing communication
performance. Both systems take advantage of bulk data transfers. By
implementing a set of parallel applications in both CRL and Split-C, and
running them on the IBM SP2, Meiko CS-2 and two simulated architectures,
we find that split-phase and push-based bulk data transfers are
essential for good performance. Region-based caching benefits
applications with irregular structure and with sufficient temporal
locality, especially under high communication latencies. However,
caching also hurts performance when there is insufficient data reuse or
when the size of caching granularity is mismatched with the
communication granularity. We find the programming complexity of the
communication mechanisms in both languages to be comparable. Based on
our results, we recommend that an ideal system intended to support
diverse applications on parallel platforms should incorporate the
communication mechanisms in CRL and Split-C.