EARLY PREDICTION OF MPP PERFORMANCE - THE SP2, T3D, AND PARAGON EXPERIENCES

Authors
Citation
Zw. Xu and K. Hwang, EARLY PREDICTION OF MPP PERFORMANCE - THE SP2, T3D, AND PARAGON EXPERIENCES, Parallel computing, 22(7), 1996, pp. 917-942
Number of citations
29
Subject Categories
Computer Sciences; Computer Science Theory & Methods
Journal title
Parallel computing
ISSN journal
0167-8191
Volume
22
Issue
7
Year of publication
1996
Pages
917 - 942
Database
ISI
SICI code
0167-8191(1996)22:7<917:EPOMP->2.0.ZU;2-P
Abstract
The performance of Massively Parallel Processors (MPPs) is attributed to a large number of machine and program factors. Software development for MPP applications is often very costly. The high cost is partially caused by a lack of early prediction of MPP performance. The program development cycle may iterate many times before achieving the desired performance level. In this paper, we present an early prediction scheme we have developed at the University of Southern California for reducing the cost of application software development. Using workload analysis and overhead estimation, our scheme optimizes the design of parallel algorithm before entering the tedious coding, debugging, and testing cycle of the applications. The scheme is generally applied at user/programmer level, not tied to any particular machine platform or any specific software environment. We have tested the effectiveness of this early performance prediction scheme by running the MIT/STAP benchmark programs on a 400-node IBM SP2 system at the Maui High-Performance Computing Center (MHPCC), on a 400-node Intel Paragon system at the San Diego Supercomputing Center (SDSC), and on a 128-node Cray T3D at the Cray Research Eagan Center in Wisconsin. Our prediction shows to be rather accurate compared with the actual performance measured on these machines. We use the SP2 data to illustrate the early prediction scheme. The main contribution of this work lies in providing a systematic procedure to estimate the computational workload, to determine the application attributes, and to reveal the communication overhead in using these MPPs. These results can be applied to develop any MPP applications other than the STAP benchmarks by which this prediction scheme was developed.