FAULT-TOLERANT DESIGN STRATEGIES FOR HIGH-RELIABILITY AND SAFETY

Citation
N.H. Vaidya and D.K. Pradhan, FAULT-TOLERANT DESIGN STRATEGIES FOR HIGH-RELIABILITY AND SAFETY, IEEE Transactions on Computers, 42(10), 1993, pp. 1195-1206
Number of citations
19
Subject Categories
Computer Sciences; Engineering, Electrical & Electronic; Computer Applications & Cybernetics
Journal ISSN
0018-9340
Volume
42
Issue
10
Year of publication
1993
Pages
1195 - 1206
Database
ISI
SICI code
0018-9340(1993)42:10<1195:FDSFHA>2.0.ZU;2-L
Abstract
Critical applications require systems with high reliability and safety. Reliability is the probability that the system produces correct output. Safety is defined as the probability that the system output is either correct, or the error in the output is detectable (the assumption being that the system is safe when the error is detected). Systems with high safety ensure that the probability of undetected errors is low. In this paper, several fundamental results related to reliability and safety are analyzed. Modular redundant systems consisting of multiple identical modules and an arbiter are considered. It is shown that for a given level of redundancy, a large number of implementation alternatives exist with varying degree of reliability and safety. Strategies are formulated that achieve a maximal combination of reliability and safety. The effect of increasing the number of modules on system reliability and safety is analyzed. It is shown that when one considers safety in addition to reliability, it does not necessarily help to simply add modules to the system. Specifically, increasing the number of modules by just one does not always improve both reliability and safety. To improve reliability and safety simultaneously, at least two additional modules are required when the outputs of the individual modules do not have any redundant information (e.g., coding for error detection). However, it is shown that if the modules themselves have built-in error detection capability, addition of just one module may be sufficient to improve both reliability and safety.
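Illustrative sketch (not taken from the paper): the definitions of reliability and safety in the abstract can be tried out on a simplified model. The model assumes n identical modules that are each correct independently with probability p, an arbiter that releases a value only when at least k modules agree on it (with k > n/2) and otherwise flags a detected error, and a worst case in which all faulty modules agree on the same wrong value. The function names and the parameters n, k and p are hypothetical and chosen only for this example; the paper's own model and strategies may differ.

```python
from math import comb


def prob_at_least(n, m, p):
    """P(at least m successes in n independent trials, success prob. p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(max(m, 0), n + 1))


def reliability_and_safety(n, k, p):
    """Reliability and safety of an assumed k-out-of-n threshold arbiter.

    Assumptions (illustrative only, not the paper's model):
      * each module is correct independently with probability p;
      * all faulty modules produce the same wrong value (worst case);
      * the arbiter outputs a value only if >= k modules agree on it,
        otherwise it signals a detected error;
      * k > n/2, so at most one value can reach the threshold.
    """
    if not (n / 2 < k <= n):
        raise ValueError("threshold k must satisfy n/2 < k <= n")
    # Correct output: at least k modules are correct.
    reliability = prob_at_least(n, k, p)
    # Undetected error: at least k faulty modules agree on a wrong value.
    undetected_error = prob_at_least(n, k, 1 - p)
    # Safe: output is either correct or the error is detected.
    safety = 1 - undetected_error
    return reliability, safety


if __name__ == "__main__":
    p = 0.9
    for n, k in [(3, 2), (4, 3), (5, 3)]:
        r, s = reliability_and_safety(n, k, p)
        print(f"n={n} k={k} reliability={r:.5f} safety={s:.5f}")
```

Under these assumptions with p = 0.9, a 3-module majority arbiter gives reliability and safety of about 0.972 each. Moving to 4 modules with threshold 3 raises safety to about 0.996 but lowers reliability to about 0.948, while 5 modules with threshold 3 raise both to about 0.991. This matches the qualitative statement in the abstract that one extra module does not always improve both measures when module outputs carry no redundant information.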