This paper presents novel compiler optimizations for reducing synchronization overhead in compiler-parallelized scientific codes. A hybrid programming model is employed to combine the flexibility of the fork-join model with the precision and power of the single-program, multiple-data (SPMD) model. By exploiting compile-time computation partitions, communication analysis can eliminate barrier synchronization or replace it with less expensive forms of synchronization. We show that computation partitions and data communication can be represented as systems of symbolic linear inequalities, yielding high flexibility and precision.
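To give a flavor of this representation, here is a minimal sketch of our own; the symbols i, n, p, b, and q are chosen for illustration. Consider a loop over index i with bounds 1 <= i <= n, block-partitioned across p processors with symbolic block size b = ceil(n/p). The iterations assigned to processor q are exactly the integer solutions of

\[
1 \le i \le n, \qquad qb + 1 \le i \le (q+1)b, \qquad 0 \le q \le p - 1,
\]

where n, p, and b are treated as symbolic constants. Deciding whether processor q' must synchronize with processor q then reduces to testing whether the system combining q's writes with q''s reads has any integer solution; if the combined system is empty for every pair of distinct processors, the barrier between them can be eliminated.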
These optimizations have been implemented in the Stanford SUIF compiler. We extensively evaluate their performance using standard benchmark suites. Experimental results show that barrier synchronization is reduced by 29% on average, and by several orders of magnitude for certain programs.
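As an illustration of replacing a barrier with a less expensive form of synchronization, consider a nearest-neighbor computation in which the analysis proves that each processor reads data written by only one other processor. The following C sketch shows the idea; it is a hypothetical example of our own, and the names NPROCS, phase, signal_done, and wait_for are illustrative assumptions rather than the SUIF runtime interface.

    #include <stdatomic.h>

    #define NPROCS 4

    /* Hypothetical per-processor completion counters; a real
     * implementation would pad these to separate cache lines. */
    static atomic_int phase[NPROCS];

    /* Producer: processor `me` has finished writing its block for step t. */
    static void signal_done(int me, int t) {
        atomic_store_explicit(&phase[me], t, memory_order_release);
    }

    /* Consumer: wait only for the single neighbor whose data we read,
     * instead of a barrier that makes all NPROCS processors rendezvous. */
    static void wait_for(int neighbor, int t) {
        while (atomic_load_explicit(&phase[neighbor], memory_order_acquire) < t)
            ; /* spin until the neighbor reaches step t */
    }

A wait of this form involves only the two processors whose computation partitions actually communicate, making it considerably cheaper than a global rendezvous across all processors when communication is sparse.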