RATIONALE AND STRATEGY FOR A 21ST-CENTURY SCIENTIFIC COMPUTING ARCHITECTURE - THE CASE FOR USING COMMERCIAL SYMMETRICAL MULTIPROCESSORS AS SUPERCOMPUTERS

Authors
Citation
W.E. Johnston, RATIONALE AND STRATEGY FOR A 21ST-CENTURY SCIENTIFIC COMPUTING ARCHITECTURE - THE CASE FOR USING COMMERCIAL SYMMETRICAL MULTIPROCESSORS AS SUPERCOMPUTERS, International Journal of High Speed Computing, 9(3), 1997, pp. 191-222
Citations number
15
ISSN journal
01290533
Volume
9
Issue
3
Year of publication
1997
Pages
191 - 222
Database
ISI
SICI code
0129-0533(1997)9:3<191:RASFA2>2.0.ZU;2-D
Abstract
In this paper we argue that the next generation of supercomputers will be based on tight-knit clusters of symmetric multiprocessor systems in order to: (i) provide higher capacity at lower cost; (ii) enable easy future expansion, and (iii) ease the development of computational science applications. This strategy involves recognizing that the current vector supercomputer user community divides (roughly) into two groups, each of which will benefit from this approach: one, the "capacity" users (who tend to run production codes aimed at solving the science problems of today) will get better throughput than they do today by moving to large symmetric multiprocessor systems (SMPs), and a second group, the "capability" users (who tend to be developing new computational science techniques) will invest the time needed to get high performance from cluster-based parallel systems. In addition to the technology-based arguments for the strategy, we believe that it also supports a vision for a revitalization of scientific computing. This vision is that an architecture based on commodity components and computer science innovation will: (i) enable very scalable high performance computing to address the high-end computational science requirements; (ii) provide better throughput and a more productive code development environment for production supercomputing; (iii) provide a path to integration with the laboratory and experimental sciences, and (iv) be the basis of an on-going collaboration between the scientific community, the computing industry, and the research computer science community in order to provide a computing environment compatible with production codes and dynamically increasing in both hardware and software capability and capacity.
We put forward the thesis that the current level of hardware performance and sophistication of the software environment found in commercial symmetric multiprocessor (SMP) systems, together with advances in distributed systems architectures, make clusters of SMPs one of the highest-performance, most cost-effective approaches to computing available today. The current capacity users of the C90-like system will be served in such an environment by having more of several critical resources than the current environment provides: much more CPU time per unit of real time, larger memory per node and much larger memory per cluster; and the capability users are served by MPP-like performance and an architecture that enables continuous growth into the future. In addition to these primary arguments, secondary advantages of SMP clusters include: the ability to replicate this sort of system in smaller units to provide identical computing environments at the home sites and laboratories of scientific users; the future potential for using the global Internet for interconnecting large clusters at a central facility with smaller clusters at other sites to form a very high capability system; and a rapidly growing base of supporting commercial software. The arguments made to support this thesis are as follows: (1) Workstation vendors are increasingly turning their attention to parallelism in order to run increasingly complex software in their commercial product lines. The pace of development by the "workstation" manufacturers due to their very large investment in research and development for hardware and software is so rapid that the special-purpose research aimed at just the high-performance market is no longer able to produce significant advantages over the mass-market products. We illustrate this trend and analyze its impact on the current performance of SMPs relative to vector supercomputers.
(2) Several factors also suggest that "clusters" of SMPs will shortly outperform traditional MPPs for reasons similar to those mentioned above. The mass-produced network architectures and components being used to interconnect SMP clusters are experiencing technology and capability growth trends similar to those of commodity computing systems. This is due to the economic drivers of the merging of computing and telecommunications technology, and the greatly increased demand for high bandwidth data communication. Very-high-speed general-purpose networks are now being produced for a large market, and the technology is experiencing the same kinds of rapid advances as workstation processor technology. The engineering required to build MPPs from special-purpose networks that are integrated in special ways with commercial microprocessors is costly and requires long engineering lead times. This results in delivered MPPs with less capable processors than are being delivered in workstations at the same time. (3) Commercial software now exists that provides integrated, MPP-style code development and system management for clusters of SMPs, and software architectures and components that will provide even more homogeneous views of clusters of SMPs are now emerging from several academic research groups. We propose that the next-generation scientific supercomputer center be built from clusters of SMPs, and suggest a strategy for an initial 50 Gflop configuration and incremental increases thereafter to reach a teraflop by just after the turn of the century. While this cluster uses what is called "network of workstations" technology, the individual nodes are, in and of themselves, powerful systems that typically have several gigaflops of CPU and several gigabytes of memory. The risks of this approach are analyzed, and found to be similar to those of MPPs.
That is, the risks are primarily in software issues that are similar for SMPs and MPPs: namely, in the provision of a homogeneous view of a distributed memory system. The argument is made that the capacity of today's large SMPs, taken together with already existing distributed systems software, will provide a versatile and powerful computational science environment. We also address the issues of application availability and code conversion to this new environment even if the homogeneous cluster software environment does not mature as quickly as expected. The throughput of the proposed SMP cluster architecture is substantial. The job mix is more easily load balanced because of the substantially greater memory size of the proposed cluster implementation as compared to a typical C90. The larger memory allows more jobs to be in the active schedule queue (in memory waiting to execute), and the larger "local" disk capacity of the cluster allows more data