Resource scaling effects on MPP performance: The STAP benchmark implications

Authors

Hwang, K; Wang, CM; Wang, CL; Xu, ZW

Citation

K. Hwang et al., Resource scaling effects on MPP performance: The STAP benchmark implications, IEEE PARALL, 10(5), 1999, pp. 509-527

Citations number

Subject Categories

Computer Science & Engineering

Journal title

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS

ISSN journal

1045-9219

Volume

10

Issue

5

Year of publication

1999

Pages

509 - 527

Database

ISI

SICI code

1045-9219(199905)10:5<509:RSEOMP>2.0.ZU;2-G

Abstract

Presently, massively parallel processors (MPPs) are available only in a few commercial models. A sequence of three ASCI Teraflops MPPs has appeared before the new millennium. This paper evaluates six MPP systems through STAP benchmark experiments. The STAP is a radar signal processing benchmark which exploits regularly structured SPMD data parallelism. We reveal the resource scaling effects on MPP performance along orthogonal dimensions of machine size, processor speed, memory capacity, messaging latency, and network bandwidth. We show how to achieve balanced resource scaling against an enlarged workload (problem size). Among three commercial MPPs, the IBM SP2 shows the highest speed and efficiency, attributed to its well-designed network with middleware support for single system image. The Cray T3D demonstrates a high network bandwidth with a good NUMA memory hierarchy. The Intel Paragon trails far behind due to the slow processors used and the excessive latency experienced in passing messages. Our analysis projects the lowest STAP speed on the ASCI Red, compared with the projected speed of the two ASCI Blue machines. This is attributed to the slow processors used in ASCI Red and the mismatch between its hardware and software. The Blue Pacific shows the highest potential to deliver scalable performance up to thousands of nodes. The Blue Mountain is designed to have the highest network bandwidth. Our results suggest a limit on the scalability of the distributed shared-memory (DSM) architecture adopted in Blue Mountain. The scaling model offers a quantitative method to match resource scaling with problem scaling to yield truly scalable performance. The model helps MPP designers optimize the processors, memory, network, and I/O subsystems of an MPP. For MPP users, the scaling results can be applied to partition a large workload for SPMD execution or to minimize the software overhead in collective communication or remote memory update operations. Finally, our scaling model is assessed to evaluate MPPs with benchmarks other than STAP.