P. T. Gaughan et al., DISTRIBUTED, DEADLOCK-FREE ROUTING IN FAULTY, PIPELINED, DIRECT INTERCONNECTION NETWORKS, IEEE Transactions on Computers, 45(6), 1996, pp. 651-665
This paper focuses on designing high-performance pipelined networks that can operate in the presence of dynamic component failures. A general, rigorous framework for deadlock-free communication in faulty, pipelined networks is developed. A mechanism is also proposed for recovering from dynamic link and node failures. The recovery mechanism 1) is fully distributed, 2) does not require timeouts, 3) prevents fault-induced deadlock, and 4) is integrated into the virtual channel flow control mechanisms. This recovery mechanism is used to develop a new pipelined communication mechanism: acknowledged pipelined circuit-switching (APCS). This mechanism supports existing routing protocols [19] that can tolerate a maximal number of static link failures, i.e., one less than the number of ports on a node. An implementation of a novel router architecture is described, and the results of detailed flit-level simulations are presented. Finally, the proposed recovery mechanism is shown to be applicable to existing adaptive wormhole routing protocols which are prone to deadlock in the presence of dynamic faults.