K.K. Goswami et al., DEPEND - A SIMULATION-BASED ENVIRONMENT FOR SYSTEM-LEVEL DEPENDABILITY ANALYSIS, IEEE Transactions on Computers, 46(1), 1997, pp. 60-74
The paper presents the rationale for a functional simulation tool, called
DEPEND, which provides an integrated design and fault injection
environment for system-level dependability analysis. The paper discusses
the issues and problems of developing such a tool, and describes how
DEPEND tackles them. Techniques developed to simulate realistic fault
scenarios, reduce simulation time explosion, and handle the large fault
model and component domain associated with system-level analysis are
presented. Examples are used to motivate and illustrate the benefits
of this tool. To further illustrate its capabilities, DEPEND is used to
simulate the Unix-based Tandem prototype fault-tolerant system, which
uses triple modular redundancy (TMR), and evaluate how well it handles
near-coincident errors caused by correlated and latent faults. Issues such
as memory scrubbing, re-integration policies, and workload-dependent
repair times, which affect how the system handles near-coincident errors,
are also evaluated. Unlike in other simulation-based dependability
studies, the accuracy of the simulation model is validated by comparing
the simulation results with measurements obtained from fault injection
experiments conducted on a production Tandem machine.