The authors show how synchronization time, which greatly affects the
time taken by a communication step, can be reduced by increasing
contention. Their experience indicates that, despite improvements in
interprocessor communication hardware, parallel algorithm designers
still need to take topology into account to obtain high performance.