L.R. Matheson and R.E. Tarjan, Parallelism in Multigrid Methods: How Much Is Too Much?, International Journal of Parallel Programming, 24(5), 1996, pp. 397-432
Citations: 38
Subject Categories: Computer Sciences; Computer Science Theory & Methods
Multigrid methods are powerful techniques to accelerate the solution of computationally intensive problems arising in a broad range of applications. Used in conjunction with iterative processes for solving partial differential equations, multigrid methods speed up iterative methods by moving the computation from the original mesh covering the problem domain through a series of coarser meshes. But this hierarchical structure leaves domain-parallel versions of the standard multigrid algorithms with a deficiency of parallelism on coarser grids. To compensate, several parallel multigrid strategies with more parallelism, but also more work, have been designed. We examine these parallel strategies and compare them to simpler standard algorithms to determine which techniques are more efficient and practical. We consider three parallel multigrid strategies: (1) domain-parallel versions of the standard V-cycle and F-cycle algorithms; (2) a multiple coarse grid algorithm, proposed by Frederickson and McBryan, which generates several coarse grids for each fine grid; and (3) two concurrent methods, the Chan-Tuminaro algorithm and the Gannon-van Rosendale algorithm, which allow computation on all grids simultaneously. We study an elliptic model problem on simple domains, discretized with finite difference techniques on block-structured meshes in two or three dimensions with up to 10^6 or 10^9 points, respectively. We analyze performance using three models of parallel computation: the PRAM and two bridging models. The bridging models reflect the salient characteristics of two kinds of parallel computers: SIMD fine-grain computers, which contain a large number of small (bit-serial) processors, and SPMD medium-grain computers, which have a more modest number of powerful (single-chip) processors. Our analysis suggests that the standard algorithms are substantially more efficient than algorithms utilizing either parallel strategy. Both parallel strategies need too much extra work to compensate for their extra parallelism; they require a highly impractical number of processors to be competitive with simpler, standard algorithms. The analysis also suggests that the F-cycle, with the appropriate optimization techniques, is more efficient than the V-cycle under a broad range of problem, implementation, and machine characteristics, despite the fact that it exhibits even less parallelism than the V-cycle.
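The grid hierarchy the abstract describes can be illustrated with a minimal sketch of a sequential V-cycle for the 1D Poisson problem -u'' = f with zero boundary values. This toy example is our own illustration, not the authors' 2D/3D block-structured setup; the weighted-Jacobi smoother, damping factor 2/3, smoothing counts, and grid sizes are all assumptions made for the sketch.

```python
import numpy as np

def smooth(u, f, h, iters=3, omega=2/3):
    # Weighted Jacobi relaxation for the 3-point stencil of -u'' = f.
    for _ in range(iters):
        u[1:-1] = ((1 - omega) * u[1:-1]
                   + omega * 0.5 * (u[:-2] + u[2:] + h * h * f[1:-1]))
    return u

def residual(u, f, h):
    # r = f - A u at interior points; boundary residual stays zero.
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] - (2 * u[1:-1] - u[:-2] - u[2:]) / (h * h)
    return r

def restrict(r):
    # Full-weighting restriction to the next coarser grid (half the intervals).
    rc = r[::2].copy()
    rc[1:-1] = 0.25 * r[1:-3:2] + 0.5 * r[2:-2:2] + 0.25 * r[3:-1:2]
    return rc

def prolong(e):
    # Linear interpolation back to the finer grid.
    ef = np.zeros(2 * (len(e) - 1) + 1)
    ef[::2] = e
    ef[1::2] = 0.5 * (e[:-1] + e[1:])
    return ef

def v_cycle(u, f, h):
    if len(u) <= 3:
        # Coarsest grid has a single interior point: solve it exactly.
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    u = smooth(u, f, h)                      # pre-smooth
    rc = restrict(residual(u, f, h))         # restrict residual
    ec = v_cycle(np.zeros_like(rc), rc, 2*h) # coarse-grid correction (recursive)
    u += prolong(ec)                         # interpolate and correct
    return smooth(u, f, h)                   # post-smooth

# Demo: -u'' = pi^2 sin(pi x) on [0, 1], exact solution u = sin(pi x).
n = 65
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
f = np.pi**2 * np.sin(np.pi * x)
u = np.zeros(n)
for _ in range(10):
    u = v_cycle(u, f, h)
err = np.max(np.abs(u - np.sin(np.pi * x)))
```

Note how the coarser levels shrink geometrically: with 65 points the recursion visits grids of 33, 17, 9, 5, and 3 points, which is precisely the "deficiency of parallelism on coarser grids" that a domain-parallel implementation suffers, since ever fewer points are available to distribute across processors.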