Software Dependability in the Tandem GUARDIAN System

Authors
Citation
I.W. Lee and R.K. Iyer, Software Dependability in the Tandem GUARDIAN System, IEEE Transactions on Software Engineering, 21(5), 1995, pp. 455-467
Number of citations
37
Subject Categories
Computer Sciences; Engineering, Electrical & Electronic; Computer Science, Software, Graphics, Programming
Journal ISSN
0098-5589
Volume
21
Issue
5
Year of publication
1995
Pages
455 - 467
Database
ISI
SICI code
0098-5589(1995)21:5<455:SDITTG>2.0.ZU;2-C
Abstract
Based on extensive field failure data for Tandem's GUARDIAN operating system, this paper discusses evaluation of the dependability of operational software. Software faults considered are major defects that result in processor failures and invoke backup processes to take over. The paper categorizes the underlying causes of software failures and evaluates the effectiveness of the process pair technique in tolerating software faults. A model to describe the impact of software faults on the reliability of an overall system is proposed. The model is used to evaluate the significance of key factors that determine software dependability and to identify areas for improvement. An analysis of the data shows that about 77% of processor failures that are initially considered due to software are confirmed as software problems. The analysis shows that the use of process pairs to provide checkpointing and restart (originally intended for tolerating hardware faults) allows the system to tolerate about 75% of reported software faults that result in processor failures. The loose coupling between processors, which results in the backup execution (the processor state and the sequence of events) being different from the original execution, is a major reason for the measured software fault tolerance. Over two-thirds (72%) of measured software failures are recurrences of previously reported faults. Modeling, based on the data, shows that, in addition to reducing the number of software faults, software dependability can be enhanced by reducing the recurrence rate.