ITA
ENG

COLLECTIVE COMMUNICATION IN WORMHOLE-ROUTED MASSIVELY-PARALLEL COMPUTERS

Authors

MCKINLEY PK TSAI YJ ROBINSON DF

Citation

Pk. Mckinley et al., COLLECTIVE COMMUNICATION IN WORMHOLE-ROUTED MASSIVELY-PARALLEL COMPUTERS, Computer, 28(12), 1995, pp. 39

Citations number

Categorie Soggetti

Computer Sciences","Computer Science Hardware & Architecture","Computer Science Software Graphycs Programming

Journal title

Computer → ACNP

ISSN journal

00189162

Volume

Issue

Year of publication

1995

Database

ISI

SICI code

0018-9162(1995)28:12<39:CCIWMC>2.0.ZU;2-C

Abstract

The supercomputer market is now dominated by parallel architectures, a mong which massively parallel computers (MPCs) are an important class of systems. The memory of an MPC is physically distributed among an en semble of computing nodes that communicate by sending data through a n etwork. Communication operations can be either point-to-point, with on e source and one destination, or collective, with more than two partic ipating processes. The design of collective communication operations d epends on the MPC's underlying network architecture. While there has b een little consensus on some aspects of communication architectures, s uch as network topology, a good deal of agreement exists regarding the most efficient way to switch messages through the network. Most MPCs use wormhole routing, in which each message is divided into small piec es that are pipelined through the network. Compared with the store-and -forward switching method used in early multicomputers, wormhole routi ng reduces the effect of path length on communication time. However, i n situations where multiple messages exist in the network concurrently , wormhole routing can exacerbate channel contention, which occurs whe n blocked messages hold some communication channels while waiting for others. Invoking a collective operation, which can involve many messag es, poses this situation. In recent years, many projects have addresse d the design of efficient collective communication algorithms for worm hole-routed systems. By exploiting the relative distance-insensitivity of wormhole routing, these new algorithms often differ fundamentally from their store-and-forward counterparts. This article examines softw are and hardware approaches to implementing collective communication o perations, illustrating several issues arising in this research area a nd describing the major classes of algorithms proposed to solve these problems.