Developing a low-cost high-quality software tool for dynamic fault-tree analysis

Citation
J.B. Dugan et al., Developing a low-cost high-quality software tool for dynamic fault-tree analysis, IEEE Transactions on Reliability, 49(1), 2000, pp. 49-59
Number of citations
30
Subject categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON RELIABILITY
Journal ISSN
0018-9529
Volume
49
Issue
1
Year of publication
2000
Pages
49 - 59
Database
ISI
SICI code
0018-9529(200003)49:1<49:DALHST>2.0.ZU;2-1
Abstract
Sophisticated modeling and analysis methods are being developed in academic and industrial research labs for reliability engineering and other domains. The evaluation and evolution of such methods based on use in practice is critical to research progress, but few such methods see widespread use. A critical impediment to disseminating new methods is the inability to produce, at a reasonable cost, supporting software tools that have the usability and dependability characteristics that industrial users require, and the evolvability to accommodate software change as the underlying analysis methods are refined and enhanced. The difficulty of software development thus emerges as a key impediment to advances in engineering modeling and analysis.
Today, producing sophisticated software tools is costly and difficult, even for capable software developers. One problem is that when common design methods, such as object-oriented programming, are used to build such tools, the results are often large, complex, and thus costly programs. Tools on the order of a million lines of code are typical, with much of the code devoted to tool interoperability, human-computer interface, and other issues not directly related to modeling and analysis.
Making matters worse, domain experts, such as reliability engineering researchers, often lack skills in modern software development, while software engineers and researchers lack knowledge of the application domains. All too often the results of tool-development efforts today are thus costly, hard to use, not dependable, and essentially unmaintainable.
This paper presents an approach to tool development that attacks these problems. Progress requires synergistic, interdisciplinary collaborations between application-domain and software-engineering researchers. We have pursued such an approach in developing Galileo, a fault-tree modeling and analysis tool.
These innovations are described in 2 dimensions:
1) the Galileo core reliability modeling and analysis function;
2) our work on software engineering for high-quality, low-cost modeling and analysis tools.
In the reliability engineering domain, Galileo supports precise, modular, dynamic fault-tree analysis using techniques developed primarily by Dugan and her colleagues. This approach addresses the problem that a single analysis technique seldom applies to an entire system. A good reliability engineer uses different techniques to analyze different parts of a system: decomposing a complex model into smaller pieces, applying different analysis techniques to submodels, and integrating partial results into a system-level result. Manually decomposing systems into parts, developing submodels, analyzing them with different tools and techniques, and integrating the partial results is tedious and error prone at best. By contrast, Galileo automatically detects independent sub-trees; translates them into appropriate submodels based on Markov chains, Boolean decision diagrams, and other formalisms; analyzes the submodels; and integrates the results. Galileo supports precise analysis while exploiting modularity for scalability in solving problems that, in the worst case, require time and space exponential in the number of basic events.
The software engineering approach centers on the component-based design techniques of Sullivan and his colleagues. A key element of the approach is the use of mass-market software packages as large components, viz, package-oriented programming. It achieves at low cost an effective human-computer interface, tool interoperability, and considerable dependability for the function delivered. Low-cost means that the effort involves a small handful of graduate and undergraduate students and faculty.
Sullivan's mediator-based design approach is also used at several scales to support an integrated, multi-view environment in which it is possible to edit fault trees in either textual or graphical form, while fostering dependability and evolvability. To help validate this modeling approach and to verify its implementation, both natural-language and formal specifications are being developed for the fault-tree gates and their interactions.
Galileo has been evaluated against commercially available fault-tree analysis tools. The results highlight the need for fidelity in analysis. Testing two tools popular in the reliability engineering community revealed the same algorithmic error in both, despite their claimed ability to provide exact solutions. At the intersection of software and reliability engineering, the redundancy inherent in the use of multiple analysis techniques in Galileo is used as an aid to testing this software. Galileo has been acquired by hundreds of sites. We are now building an enhanced version with NASA Langley Research Center.
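Illustrative note (not part of the original record): the modular analysis summarized in the abstract, detecting independent sub-trees, solving each one separately, and integrating the results, can be sketched on a toy static fault tree. This is a minimal sketch under stated assumptions, not Galileo's implementation: the tree, the gate names, the probabilities, and every function name are hypothetical, only AND/OR gates are handled (no dynamic gates, Markov chains, or decision diagrams), and brute-force enumeration stands in for a real solver. Comparing the modular answer with a direct whole-tree answer mirrors, very loosely, the idea of using redundant analysis techniques as a testing aid.

```python
from itertools import product

# Hypothetical toy tree: gate name -> (gate type, children).
# Any name that is not a key of GATES is a basic event.
GATES = {
    "TOP": ("OR", ["G1", "G2"]),
    "G1": ("AND", ["a", "b"]),
    "G2": ("AND", ["c", "G3"]),
    "G3": ("OR", ["c", "d"]),   # shares basic event "c" with G2
}
PROB = {"a": 0.1, "b": 0.2, "c": 0.05, "d": 0.3}  # assumed failure probabilities


def descendants(node, gates):
    """Return every node strictly below `node`."""
    seen = set()
    stack = list(gates[node][1]) if node in gates else []
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            if n in gates:
                stack.extend(gates[n][1])
    return seen


def is_independent(gate, gates):
    """True if no node below `gate` also has a parent outside its sub-tree."""
    inside = descendants(gate, gates) | {gate}
    for parent, (_, children) in gates.items():
        if parent in inside:
            continue
        if any(c in inside and c != gate for c in children):
            return False
    return True


def evaluate(node, gates, state):
    """Evaluate the Boolean structure function for one assignment of events."""
    if node not in gates:
        return state[node]
    kind, children = gates[node]
    values = [evaluate(c, gates, state) for c in children]
    return all(values) if kind == "AND" else any(values)


def exact_probability(top, gates, prob):
    """Brute-force enumeration over the basic events below `top`."""
    events = sorted(n for n in descendants(top, gates) if n not in gates)
    total = 0.0
    for bits in product([False, True], repeat=len(events)):
        state = dict(zip(events, bits))
        if evaluate(top, gates, state):
            weight = 1.0
            for e in events:
                weight *= prob[e] if state[e] else 1.0 - prob[e]
            total += weight
    return total


def modular_probability(top, gates, prob):
    """Solve each independent sub-tree alone, then treat it as a basic event."""
    gates, prob = dict(gates), dict(prob)
    modules = [g for g in gates if g != top and is_independent(g, gates)]
    for gate in modules:
        if gate not in gates:  # already absorbed into a larger module
            continue
        prob[gate] = exact_probability(gate, gates, prob)
        for inner in [n for n in descendants(gate, gates) if n in gates]:
            del gates[inner]
        del gates[gate]
    return exact_probability(top, gates, prob)


direct = exact_probability("TOP", GATES, PROB)
modular = modular_probability("TOP", GATES, PROB)
print(direct, modular)
```

In this toy tree G1 and G2 are independent modules, while G3 is not because it shares the basic event c with its parent G2. Both routes should give a top-event probability of approximately 0.069; a disagreement between them would point to a bug in one of the two paths.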