A COMPARISON OF COARSE AND FINE-GRAIN PARALLELIZATION STRATEGIES FOR THE SIMPLE PRESSURE CORRECTION ALGORITHM

Authors
Citation
Aj. Lewis and Ad. Brent, A COMPARISON OF COARSE AND FINE-GRAIN PARALLELIZATION STRATEGIES FOR THE SIMPLE PRESSURE CORRECTION ALGORITHM, International journal for numerical methods in fluids, 16(10), 1993, pp. 891-914
Number of citations
15
Subject Categories
Mathematical Method, Physical Science; Physics, Fluid & Plasmas; Mechanics
ISSN journal
02712091
Volume
16
Issue
10
Year of publication
1993
Pages
891 - 914
Database
ISI
SICI code
0271-2091(1993)16:10<891:ACOCAF>2.0.ZU;2-P
Abstract
The primary aim of this work was to determine the simplest and most effective parallelization strategy for control-volume-based codes solving industrial problems. It has been found that for certain classes of problems, the coarse-grain functional decomposition strategy, largely ignored due to its limited scaling capability, offers the potential for significant execution speed-ups while maintaining the inherent structure of traditional serial algorithms. Functional decomposition requires only minor modification of the existing serial code to implement and, hence, code portability across both concurrent and serial computers is maintained. Fine-grain parallelization strategies at the 'DO loop' level are also easy to implement and largely preserve code portability. Both coarse-grain functional decomposition and fine-grain loop-level parallelization strategies for the SIMPLE pressure correction algorithm are demonstrated on a Silicon Graphics 4D280S eight CPU shared memory computer system for a highly coupled, transient two-dimensional simulation involving melting of a metal in the presence of thermal-buoyancy-driven laminar convection. Problems requiring the solution of a larger number of transport equations were simulated by including further scalar variables in the calculation. While resulting in slight degradation of the convergence rate, the functional decomposition strategy exhibited higher parallel efficiencies and yielded greater speed-ups relative to the original serial code. Initially, this strategy showed a significant degradation in convergence rate due to an inconsistency in the parallel solution of the pressure correction equation. After correcting for this inconsistency, the maximum speed-up for 16 dependent variables was a factor of 5.28 with eight processors, representing a parallel efficiency of 67%. Peak efficiency of 76% was achieved using five processors to solve for 10 dependent variables.
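The coarse-grain functional decomposition described in the abstract can be pictured as assigning each dependent variable's transport-equation solve to its own worker. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `relax`, `solve_serial`, `solve_functional` and `parallel_efficiency` are hypothetical, the 1D Jacobi sweep is only a toy stand-in for a real control-volume solve, and the coupling between equations (including the pressure correction step) is ignored. Python threads are used purely to show the task structure and would not be expected to reproduce the reported speed-ups.

```python
from concurrent.futures import ThreadPoolExecutor


def relax(field, sweeps=50):
    """Toy stand-in for one transport-equation solve:
    1D Jacobi sweeps with both end values held fixed (assumption)."""
    u = list(field)
    for _ in range(sweeps):
        interior = [0.5 * (u[i - 1] + u[i + 1]) for i in range(1, len(u) - 1)]
        u = [u[0]] + interior + [u[-1]]
    return u


def solve_serial(fields):
    """Serial reference: solve each dependent variable in turn."""
    return {name: relax(values) for name, values in fields.items()}


def solve_functional(fields, workers):
    """Coarse-grain functional decomposition: one task per dependent variable."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(relax, values) for name, values in fields.items()}
        return {name: future.result() for name, future in futures.items()}


def parallel_efficiency(speedup, processors):
    """Parallel efficiency defined as speed-up divided by processor count."""
    return speedup / processors


# Four hypothetical scalar variables, each on a 10-node 1D grid.
fields = {f"phi{k}": [0.0] * 9 + [float(k + 1)] for k in range(4)}
serial_result = solve_serial(fields)
parallel_result = solve_functional(fields, workers=4)
```

Because each toy solve here is independent, the decomposed run returns the same fields as the serial one. With efficiency defined as speed-up over processor count, the reported peak efficiency of 76% on five processors corresponds to a speed-up of about 3.8.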