DESIGN AND ANALYSIS OF AN OPTIMAL INSTRUCTION RETRY POLICY FOR TMR CONTROLLER COMPUTERS

Authors
Hb. Kim, Kg. Shin
Citation
Hb. Kim and Kg. Shin, DESIGN AND ANALYSIS OF AN OPTIMAL INSTRUCTION RETRY POLICY FOR TMR CONTROLLER COMPUTERS, IEEE Transactions on Computers, 45(11), 1996, pp. 1217-1225
Number of citations
17
Subject Categories
Computer Sciences; Engineering, Electrical & Electronic; Computer Science, Hardware & Architecture
Journal ISSN
0018-9340
Volume
45
Issue
11
Year of publication
1996
Pages
1217 - 1225
Database
ISI
SICI code
0018-9340(1996)45:11<1217:DAAOAO>2.0.ZU;2-V
Abstract
An instruction-retry policy is proposed to enhance the fault-tolerance of triple modular redundant (TMR) controller computers by adding time redundancy to them. A TMR failure is said to occur if a TMR system fails to establish a majority among its modules' outputs due to multiple faulty modules or a faulty voter. Either multiple consecutive TMR failures the active period of which exceeds a certain time limit or the exhaustion of spares as a result of frequent system reconfigurations may result in failure to meet the timing constraints of one or more tasks, called the dynamic failure, during a given mission. An optimal instruction-retry period is derived by minimizing the probability of dynamic failure upon detection of either a masked (by the TMR) error or a TMR failure. We also derive the minimum number of spares needed to keep below the pre-specified level the probability of dynamic failure for a given mission by using the derived optimal retry period.