A new approach to fault-tolerant scheduling using task duplication in multiprocessor systems

Citation
K. Hashimoto et al., A new approach to fault-tolerant scheduling using task duplication in multiprocessor systems, J SYST SOFT, 53(2), 2000, pp. 159-171
Number of citations
21
Subject categories
Computer Science & Engineering
Journal title
JOURNAL OF SYSTEMS AND SOFTWARE
Journal ISSN
0164-1212
Volume
53
Issue
2
Year of publication
2000
Pages
159 - 171
Database
ISI
SICI code
0164-1212(20000831)53:2<159:ANATFS>2.0.ZU;2-F
Abstract
In this paper we propose a new approach to fault-tolerant scheduling of parallel programs in multiprocessor systems. It is well known that inter-processor communication seriously affects the performance of parallel processing, and that task duplication is an effective technique for reducing communication overheads. Though it was originally developed only for improving performance, we propose using this technique to achieve fault tolerance as well. Based on this approach, we develop a scheduling algorithm that tolerates a single processor failure. This algorithm duplicates all tasks of a program and allocates them to processors so as to eliminate communication delays as much as possible. The experimental results show that the obtained schedules can achieve fault tolerance at the cost of a small degree of time redundancy. (C) 2000 Elsevier Science Inc. All rights reserved.
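The idea summarized in the abstract (keep two copies of every task on different processors, and co-locate copies with their predecessors to avoid communication) can be illustrated with a minimal Python sketch. This is not the algorithm from the paper, whose details are not given in this record. The function names, the greedy placement heuristic, and the example task graph are all assumptions made for illustration.

```python
from collections import defaultdict


def duplicate_and_allocate(tasks, preds, num_procs):
    """Place two copies of every task on distinct processors.

    tasks: task names in topological order.
    preds: dict mapping a task to the list of its predecessors.

    Hypothetical greedy heuristic (an assumption, not the paper's method):
    each copy goes to the processor that already holds copies of the most
    predecessors, so those inputs need no inter-processor communication.
    Ties are broken by lower load, then by lower processor index.
    """
    if num_procs < 2:
        raise ValueError("two copies on distinct processors need >= 2 processors")
    placement = defaultdict(set)  # task -> set of processors holding a copy
    load = [0] * num_procs        # number of task copies per processor
    for task in tasks:
        for _ in range(2):        # primary copy, then duplicate
            candidates = [p for p in range(num_procs) if p not in placement[task]]

            def score(p):
                local_preds = sum(1 for q in preds.get(task, []) if p in placement[q])
                return (-local_preds, load[p], p)

            best = min(candidates, key=score)
            placement[task].add(best)
            load[best] += 1
    return dict(placement)


def survives_single_failure(placement, num_procs):
    """Check that every task keeps at least one copy if any one processor fails."""
    return all(
        len(procs - {failed}) >= 1
        for failed in range(num_procs)
        for procs in placement.values()
    )


if __name__ == "__main__":
    # Hypothetical diamond-shaped task graph: a -> b, a -> c, (b, c) -> d.
    tasks = ["a", "b", "c", "d"]
    preds = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
    placement = duplicate_and_allocate(tasks, preds, num_procs=3)
    print(placement)
    print(survives_single_failure(placement, 3))
```

The sketch only decides where copies live and checks the single-failure property. It does not compute start times, so it says nothing about the time redundancy the abstract reports.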