We present compiler analyses and optimizations for explicitly parallel programs that communicate through a shared address space. Any type of code motion on explicitly parallel programs requires a new kind of analysis to ensure that operations reordered on one processor cannot be observed by another. The analysis, based on work by Shasha and Snir, checks for cycles among interfering accesses. We improve the accuracy of their analysis by using additional information from post-wait synchronization, barriers, and locks. We demonstrate the use of this analysis by optimizing remote accesses on distributed memory machines. The optimizations include message pipelining, to allow multiple outstanding remote memory operations; conversion of two-way to one-way communication; and elimination of communication through data reuse. The performance improvements are as high as 20-35% for programs running on a CM-5 multiprocessor using the Split-C language as a global address layer.
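The cycle check referred to above can be illustrated with a simplified sketch. This is not the paper's algorithm: it is a hypothetical model in which each processor contributes directed program-order edges between its accesses, interfering accesses on different processors contribute bidirectional conflict edges, and a cycle through this mixed graph (one that does not simply bounce back across a single conflict edge) signals that reordering could be observed by another processor. All names and the graph encoding are assumptions for illustration.

```python
from collections import defaultdict

def has_cycle(program_edges, conflict_pairs):
    """Simplified stand-in for Shasha/Snir-style cycle detection.

    program_edges: directed (earlier, later) pairs on one processor.
    conflict_pairs: unordered pairs of interfering accesses; treated
    as edges in both directions.  A cycle that does not immediately
    retraverse the same conflict edge indicates accesses whose
    reordering could be observed by another processor.
    """
    succ = defaultdict(list)  # node -> [(next_node, edge_id)]
    for i, (u, v) in enumerate(program_edges):
        succ[u].append((v, ('p', i)))          # program order: one direction
    for i, (u, v) in enumerate(conflict_pairs):
        succ[u].append((v, ('c', i)))          # conflict: both directions
        succ[v].append((u, ('c', i)))

    nodes = set(succ)
    for outs in list(succ.values()):
        nodes.update(v for v, _ in outs)

    def dfs(start):
        # Each stack entry: (current node, edge used to get here, nodes on path)
        stack = [(start, None, frozenset([start]))]
        while stack:
            node, last_edge, path = stack.pop()
            for nxt, eid in succ[node]:
                # Don't bounce straight back across the same conflict edge.
                if eid == last_edge and eid[0] == 'c':
                    continue
                if nxt == start and len(path) > 1:
                    return True
                if nxt not in path:
                    stack.append((nxt, eid, path | {nxt}))
        return False

    return any(dfs(n) for n in nodes)

# Dekker-style pattern: P1 writes X then reads Y; P2 writes Y then reads X.
# The program-order and conflict edges form a cycle, so neither processor's
# accesses may be reordered.
cyclic = has_cycle(
    program_edges=[("w1_X", "r1_Y"), ("w2_Y", "r2_X")],
    conflict_pairs=[("w1_X", "r2_X"), ("w2_Y", "r1_Y")],
)

# A single isolated conflict yields no cycle; reordering is invisible.
acyclic = has_cycle(
    program_edges=[("w1_X", "r1_Y")],
    conflict_pairs=[("w1_X", "w2_X")],
)
```

In this toy model, `cyclic` is `True` and `acyclic` is `False`: only the Dekker-style pattern forbids reordering. The actual analysis in the paper refines this idea with synchronization information from post-wait, barriers, and locks to rule out cycles that synchronization already prevents.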