Ym. Wang et al., Classifying and Alleviating the Communication Overheads in Matrix Computations on Large-Scale NUMA Multiprocessors, The Journal of Systems and Software, 44(1), 1998, pp. 17-29
Number of citations
17
Subject Categories
Computer Science Theory & Methods; Computer Science Software Graphics Programming
Large-scale, shared-memory multiprocessors have non-uniform memory
access (NUMA) costs. High communication cost dominates the execution
time of matrix computations. Memory contention and remote memory access
are the two major communication overheads on large-scale NUMA
multiprocessors. However, previous experiments and discussions focus
either on reducing the number of remote memory accesses or on
alleviating memory contention overhead, but not on both. In this paper,
we propose a simple but effective processor allocation policy, called
rectangular processor allocation, to alleviate both overheads at the
same time. The policy divides the matrix elements into a certain number
of rectangular blocks and assigns each processor to compute the results
of one rectangular block. This methodology may eliminate many
unnecessary accesses to the memory modules. After running many matrix
computations under a realistic memory system simulator, we confirmed
that at least one-fourth of the communication overhead may be reduced.
Therefore, we conclude that the rectangular processor allocation policy
performs better than other popular policies, and that combining it with
the software interleaving data allocation policy is a better choice for
alleviating communication overhead.
(C) 1998 Elsevier Science Inc. All rights reserved.
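
The block-per-processor idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the processor-grid parameters (p_rows, p_cols), and the use of a matrix product C = A @ B as the workload are all assumptions made for illustration. The sketch only counts how many distinct operand elements one processor reads, which is a rough proxy for memory traffic and ignores contention and data placement.

```python
def rectangular_blocks(n_rows, n_cols, p_rows, p_cols):
    """Map each processor id to the half-open (row, col) ranges of its block.

    The result matrix is cut into a p_rows x p_cols grid of rectangles
    (hypothetical parameters, not taken from the paper).
    """
    blocks = {}
    for pr in range(p_rows):
        for pc in range(p_cols):
            r0 = pr * n_rows // p_rows
            r1 = (pr + 1) * n_rows // p_rows
            c0 = pc * n_cols // p_cols
            c1 = (pc + 1) * n_cols // p_cols
            blocks[pr * p_cols + pc] = ((r0, r1), (c0, c1))
    return blocks


def operand_elements_touched(block, inner_dim):
    """Count distinct elements of A and B read to compute one block of C = A @ B.

    A block spanning rows r0..r1 and columns c0..c1 of C needs those rows
    of A and those columns of B, each of length inner_dim.
    """
    (r0, r1), (c0, c1) = block
    return (r1 - r0) * inner_dim + inner_dim * (c1 - c0)
```

Under these assumptions, for an 8 x 8 product on 4 processors, a row-strip layout (rectangular_blocks(8, 8, 4, 1)) has each processor read 80 operand elements, while a 2 x 2 grid of square blocks (rectangular_blocks(8, 8, 2, 2)) has each read 64.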