Threshold-based mechanisms to discriminate transient from intermittent faults

Citation
A. Bondavalli et al., Threshold-based mechanisms to discriminate transient from intermittent faults, IEEE COMPUT, 49(3), 2000, pp. 230-245
Number of citations
16
Subject Categories
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON COMPUTERS
Journal ISSN
0018-9340
Volume
49
Issue
3
Year of publication
2000
Pages
230 - 245
Database
ISI
SICI code
0018-9340(200003)49:3<230:TMTDTF>2.0.ZU;2-B
Abstract
This paper presents a class of count-and-threshold mechanisms, collectively named alpha-count, which are able to discriminate between transient faults and intermittent faults in computing systems. For many years, commercial systems have been using transient fault discrimination via threshold-based techniques. We aim to contribute to the utility of count-and-threshold schemes by exploring their effects on the system. We adopt a mathematically defined structure, which is simple enough to analyze by standard tools; alpha-count is equipped with internal parameters that can be tuned to suit environmental variables (such as transient fault rate and intermittent fault occurrence patterns). We carried out an extensive behavior analysis for two versions of the count-and-threshold scheme, assuming, first, exponentially distributed fault occurrences and, then, more realistic fault patterns.
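Illustrative sketch
The abstract does not spell out the update rule, so the following is only a minimal sketch of a generic count-and-threshold filter of the kind it describes. The rule used here (add 1 to a score on each error signal, multiply the score by a decay factor on each error-free step, raise an alarm when the score reaches a threshold) is an assumption, as are the names AlphaCount, decay, threshold and first_alarm and all numeric values; none of them are taken from this record.

```python
from dataclasses import dataclass


@dataclass
class AlphaCount:
    """Hypothetical count-and-threshold filter for one monitored component."""

    decay: float = 0.5      # assumed factor applied to the score after an error-free step
    threshold: float = 3.0  # assumed score at which faults are treated as intermittent
    score: float = 0.0

    def observe(self, error: bool) -> bool:
        """Update the score with one judgment; return True if the threshold is reached."""
        if error:
            self.score += 1.0
        else:
            self.score *= self.decay
        return self.score >= self.threshold


def first_alarm(signals, decay=0.5, threshold=3.0):
    """Return the index of the first step that reaches the threshold, or None."""
    counter = AlphaCount(decay=decay, threshold=threshold)
    for step, error in enumerate(signals):
        if counter.observe(error):
            return step
    return None


# Isolated errors separated by long error-free runs: the score decays between them.
transient = [True] + [False] * 9 + [True] + [False] * 9
# Errors recurring in quick succession: the score accumulates faster than it decays.
intermittent = [True, False, True, True, False, True, True]

print(first_alarm(transient))
print(first_alarm(intermittent))
```

With these assumed values, the widely spaced errors never reach the threshold (the first call prints None), while the closely spaced errors reach it at step index 6. Changing decay and threshold shifts the trade-off between tolerating transients and reacting quickly to intermittent faults, which is the kind of tuning the abstract refers to.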