ITA
ENG

Fault-tolerant processor arrays using additional bypass linking allocated by graph-node coloring

Authors

Tsuda, N

Citation

N. Tsuda, Fault-tolerant processor arrays using additional bypass linking allocated by graph-node coloring, IEEE COMPUT, 49(5), 2000, pp. 431-442

Citations number

Categorie Soggetti

Computer Science & Engineering

Journal title

IEEE TRANSACTIONS ON COMPUTERS

ISSN journal

00189340 → ACNP

Volume

Issue

Year of publication

2000

Pages

431 - 442

Database

ISI

SICI code

0018-9340(200005)49:5<431:FPAUAB>2.0.ZU;2-4

Abstract

An advanced spare-connection scheme for k-out-of-n redundancy called "gener alized additional bypass linking" is proposed for constructing fault-tolera nt massively parallel computers with series-connected, mesh-connected, or t ree-connected processing element (PE) arrays. This scheme uses bypass links with wired OR connections to selectively connect the primary PEs to a spar e PE in parallel. These bypass links are allocated to the primary PEs by no de-coloring of a graph with a minimum inter-node distance of three in order to minimize the number of bypass links (i.e., the chromatic number). The m ain advantage of this scheme is that it can be used for constructing variou s k-out-of-n configurations capable of enhanced PE-to-PE communication and broadcast while still achieving strong fault tolerance for these PEs and li nks. In particular, it enables the construction of optimal r-strongly-fault -tolerant configurations capable of direct k-out-of-n selections by providi ng r spare PEs and r extra connections per PE for any kind of array when no de-coloring with a distance of three is used. This simple spare-circuit str ucture enhances fault tolerance more than conventional schemes do. The node -coloring patterns were constructed using new node-coloring algorithms and the chromatic numbers were evaluated theoretically. Enhanced PE-to-PE commu nication and broadcast were achieved by using new fault-tolerant routing al gorithms based on the properties of the node-coloring patterns with four or five message transmission steps being optimal configurations with any size array.