A fault-tolerance model for multiprocessor real-time systems

Citation
St. Cheng et al., A fault-tolerance model for multiprocessor real-time systems, J COMPUT SY, 61(3), 2000, pp. 457-477
Number of citations
25
Subject categories
Computer Science & Engineering
Journal title
JOURNAL OF COMPUTER AND SYSTEM SCIENCES
Journal ISSN
0022-0000
Volume
61
Issue
3
Year of publication
2000
Pages
457 - 477
Database
ISI
SICI code
0022-0000(200012)61:3<457:AFMFMR>2.0.ZU;2-Y
Abstract
System reliability is an important aspect of real-time systems, because the result of a real-time application may be valid only if the application functions correctly and its timing constraints are satisfied. There are two kinds of faults, hardware and software faults, and the paper considers hardware transient faults. Full replication or full hardware redundancy can achieve a high degree of reliability; however, it wastes lots of resources. For most real-time systems, such schemes might not be available and hence reliability estimation becomes essential. We propose an analytic model for system reliability estimation based on the Markov chain and investigate the accuracy of the estimated reliability. The results show that the proposed model obtains good estimation in various simulated real-time systems. (C) 2000 Academic Press.