An efficient distributed fault-tolerant routing algorithm for the hypercube
is proposed based on the existence of a complete set of node-disjoint path
s between any two nodes. Node failure and repairs may occur dynamically pro
vided that the total number of faulty nodes at any time is less than the no
de-connectivity n of the n-cube. Each node maintains for each possible dest
ination which of the associated node-disjoint paths to use. When a message
is blocked by a node failure, the source node is warned and requested to sw
itch to a different node-disjoint path. The methods used to identify the pa
ths, to propagate node failure information to source nodes, and to switch f
rom one routing path to another incur little communication and computation
overhead. We show that if the faults occur reasonably apart in time, then a
ll messages will be routed on optimal or near optimal paths. In the unlikel
y case where many faults occur in a short period, the algorithm still deliv
ers all messages but via possibly longer paths. An extension of the obtaine
d algorithm to handle link failures in addition to node failures is discuss
ed. We also show how to adapt the algorithm to k-ary n-cube networks. The a
lgorithm can be similarly adapted to any interconnection network for which
there exists a simple characterization of node-disjoint paths between its n
odes.