K. Kanoun and M. Ortalo-Borrel, Fault-tolerant system dependability - Explicit modeling of hardware and software component-interactions, IEEE Transactions on Reliability, 49(4), 2000, pp. 363-376
This paper presents a framework for modeling the dependability of hardware
and software fault-tolerant systems that explicitly takes into account the
dependencies among components. These dependencies can result from:
a) functional or structural interactions between the components, or
b) interactions due to global system reconfiguration and maintenance
strategies. Modeling is based on generalized stochastic Petri nets (GSPN).
The approach is modular: the behavior of each component and each
interaction is represented by its own GSPN, and the system model is
obtained by composing these GSPN. Composition rules are defined and
formalized through clear identification of the interfaces between the
component nets and the interaction nets. In addition to modularity, the
formalism brings flexibility and reusability, which makes it easy to
perform sensitivity analysis with respect to the assumptions made about
the behavior of the components and the resulting interactions.
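The modular composition described above can be illustrated with a minimal
sketch. It is not the authors' formalism: it ignores timing, rates, and
immediate transitions, and it assumes that composition means fusing places
that carry the same name and are declared as interface places in both nets.
All names (Net, compose, hw_ok, sw_error, and so on) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transition:
    name: str
    inputs: frozenset   # names of input places
    outputs: frozenset  # names of output places


@dataclass
class Net:
    """A tiny untimed Petri-net module with declared interface places."""
    places: set = field(default_factory=set)
    transitions: dict = field(default_factory=dict)
    interface: set = field(default_factory=set)

    def add_transition(self, name, inputs, outputs):
        self.places |= set(inputs) | set(outputs)
        self.transitions[name] = Transition(
            name, frozenset(inputs), frozenset(outputs)
        )


def compose(*nets):
    """Merge modules, fusing same-named places.

    Assumption of this sketch: a place may be shared only if both sides
    declare it as an interface place; any other name clash is an error.
    """
    result = Net()
    for net in nets:
        shared = result.places & net.places
        illegal = shared - (result.interface & net.interface)
        if illegal:
            raise ValueError(f"non-interface places shared: {sorted(illegal)}")
        duplicated = result.transitions.keys() & net.transitions.keys()
        if duplicated:
            raise ValueError(f"duplicate transitions: {sorted(duplicated)}")
        result.places |= net.places
        result.transitions.update(net.transitions)
        result.interface |= net.interface
    return result


# Component net: hardware with a fault-activation transition.
hw = Net(interface={"hw_error"})
hw.add_transition("hw_fault_activation", ["hw_ok"], ["hw_error"])

# Component net: software with a restart transition.
sw = Net(interface={"sw_ok", "sw_error"})
sw.add_transition("sw_restart", ["sw_error"], ["sw_ok"])

# Interaction net: error propagation from hardware to hosted software.
propagation = Net(interface={"hw_error", "sw_ok", "sw_error"})
propagation.add_transition(
    "propagate_hw_to_sw", ["hw_error", "sw_ok"], ["sw_error"]
)

system = compose(hw, sw, propagation)
```

Because the interaction lives in its own net, replacing it with a different
propagation assumption only requires building a new interaction net and
calling compose again; the component nets are left untouched.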
This approach has been successfully applied to select new architectures
for the French Air Traffic Control system, based, among other things, on
availability evaluation. This paper illustrates it on a simple,
representative example that includes all the identified types of
dependency: the duplex system. Modeling this system showed the strong
dependence between components. For example, the activation of a temporary
hardware fault can propagate an error to the hosted software component,
which in turn can propagate it to other components communicating with it
(not necessarily on the same computer). Thus the activation of a temporary
hardware fault can lead to the restart of one or more software components.
Although this has been observed on real systems, it has not been modeled
explicitly in previous work. This paper shows how one or several
assumptions can be modified without modifying all the GSPN, considering
two repair policies and two switching policies (with or without manual
switch).
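The propagation chain in the example above (temporary hardware fault, error
in the hosted software, error in a communicating software component, then
restarts) can be traced with a small untimed token game. This is only a
sketch under assumptions: the place and transition names are hypothetical,
the temporary fault is assumed to disappear once its error has propagated,
and the firing order is chosen by hand rather than by stochastic rates.

```python
# Each transition is (input places, output places); one token per arc.
TRANSITIONS = {
    "hw_fault": (["hw_ok"], ["hw_err"]),
    # Assumption: the temporary fault vanishes after propagating.
    "propagate_hw_to_sw1": (["hw_err", "sw1_ok"], ["hw_ok", "sw1_err"]),
    # sw1 stays erroneous while its error reaches sw2 (possibly remote).
    "propagate_sw1_to_sw2": (["sw1_err", "sw2_ok"], ["sw1_err", "sw2_err"]),
    "restart_sw1": (["sw1_err"], ["sw1_ok"]),
    "restart_sw2": (["sw2_err"], ["sw2_ok"]),
}


def enabled(marking, name):
    """A transition is enabled when every input place holds a token."""
    inputs, _ = TRANSITIONS[name]
    return all(marking.get(place, 0) >= 1 for place in inputs)


def fire(marking, name):
    """Return the marking reached by firing an enabled transition."""
    if not enabled(marking, name):
        raise ValueError(f"transition {name!r} is not enabled")
    inputs, outputs = TRANSITIONS[name]
    new_marking = dict(marking)
    for place in inputs:
        new_marking[place] -= 1
    for place in outputs:
        new_marking[place] = new_marking.get(place, 0) + 1
    return new_marking


def run(marking, sequence):
    """Fire transitions in order and return the final marking."""
    for name in sequence:
        marking = fire(marking, name)
    return marking


initial = {"hw_ok": 1, "sw1_ok": 1, "sw2_ok": 1}
scenario = [
    "hw_fault",
    "propagate_hw_to_sw1",
    "propagate_sw1_to_sw2",
    "restart_sw1",
    "restart_sw2",
]
final = run(initial, scenario)
restarts = [name for name in scenario if name.startswith("restart_")]
```

In this scenario a single hardware fault leads to two software restarts,
which is the kind of dependency the abstract says was not modeled
explicitly in earlier work.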
The main advantage of this modeling approach, which considers the
interactions explicitly, lies in its efficiency for modeling several
alternatives for the same system. These alternatives can differ in their
composition or organization, or in their fault-tolerance and maintenance
strategies. One can clearly identify from the beginning the components and
interactions that are specific and those that are common to all
alternatives. The common GSPN are thus developed and validated only once.