K. Hashimoto et al., A new approach to fault-tolerant scheduling using task duplication in multiprocessor systems, J SYST SOFT, 53(2), 2000, pp. 159-171
In this paper we propose a new approach to fault-tolerant scheduling of
parallel programs in multiprocessor systems. It is well known that
inter-processor communication seriously affects the performance of
parallel processing, and that task duplication is an effective technique
for reducing communication overhead. Though it was originally developed
only for improving performance, we propose using this technique to achieve
fault tolerance as well. Based on this approach, we develop a scheduling
algorithm that tolerates a single processor failure. This algorithm
duplicates all tasks of a program and allocates them to processors so as
to eliminate communication delays as much as possible. The experimental
results show that the obtained schedules achieve fault tolerance at the
cost of a small degree of time redundancy. (C) 2000 Elsevier Science Inc.
All rights reserved.
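
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the program has already been split into disjoint chains of dependent tasks; the names `place_duplicates` and `tolerates_single_failure` and the "next processor" backup rule are hypothetical. Each chain runs entirely on one processor, so no inter-processor communication is needed inside it, and a duplicate of the whole chain runs on a second processor so that any single processor failure leaves one copy of every task.

```python
from typing import Dict, List, Set


def place_duplicates(chains: List[List[str]], num_procs: int) -> Dict[str, Set[int]]:
    """Assign every task to two distinct processors (hypothetical rule).

    Chain i goes to processor i mod num_procs, and its duplicate goes to
    the next processor. Chains are assumed to be disjoint.
    """
    if num_procs < 2:
        raise ValueError("duplication needs at least two processors")
    placement: Dict[str, Set[int]] = {}
    for i, chain in enumerate(chains):
        primary = i % num_procs
        backup = (primary + 1) % num_procs
        for task in chain:
            placement[task] = {primary, backup}
    return placement


def tolerates_single_failure(placement: Dict[str, Set[int]], num_procs: int) -> bool:
    """Return True if every task keeps a copy after any one processor fails."""
    for failed in range(num_procs):
        for procs in placement.values():
            if not (procs - {failed}):
                return False
    return True


if __name__ == "__main__":
    chains = [["a", "b"], ["c"]]
    placement = place_duplicates(chains, num_procs=3)
    print(placement)
    print(tolerates_single_failure(placement, num_procs=3))
```

The sketch ignores execution times and communication costs between chains, which a real scheduler of this kind would have to weigh when choosing where the duplicates go.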