P. Rogers and A.J. Wellings, State restoration in Ada 95: a portable approach to supporting software fault tolerance, J SYST SOFT, 50(3), 2000, pp. 237-255
Studies indicate that techniques for tolerating hardware faults are so
effective that software design errors are the leading cause of all faults
encountered. To handle these unanticipated software faults, two main
approaches have been proposed: N-version programming and recovery blocks.
Both are based on the concept of design diversity: the assumption that
different designs will exhibit different faults (if any) for the same inputs
and will, therefore, provide alternatives for each other. Both approaches
have advantages, but this paper focuses upon recovery blocks; specifically,
the requirement to save and restore application state. Judicious saving of
state has been described as "checkpointing" for over a decade. Using the
object-oriented features of the revised Ada language (Ada 95), a language
widely used in this domain, we present three portable implementations of a
checkpointing facility and discuss the trade-offs offered by each. Results
of the implementation of these mechanisms are used to highlight both the
strengths and weaknesses of some of the object-oriented features of Ada. We
then show a reusable implementation of recovery blocks illustrating the
checkpointing schemes. A performance analysis is made and measurements are
presented in support of the analysis. (C) 2000 Elsevier Science Inc. All
rights reserved.
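
The recovery-block idea summarized above (save state on entry, try an alternate, apply an acceptance test, restore state and try the next alternate on failure) can be illustrated with a minimal sketch. It is written in Python rather than Ada 95 and is not the authors' implementation. The names `recovery_block`, `RecoveryBlockFailure`, `primary_sort`, `backup_sort` and `sorted_and_complete` are hypothetical, and the whole-state deep copy is an assumed stand-in for the three checkpointing schemes the paper compares.

```python
import copy


class RecoveryBlockFailure(Exception):
    """Raised when every alternate fails its acceptance test."""


def recovery_block(state, alternates, acceptance_test):
    """Run alternates in order until one passes the acceptance test.

    `state` is a mutable dict. It is checkpointed once on entry and
    restored after each failed attempt, so every alternate starts from
    the same inputs (assumption: a full deep copy is affordable).
    """
    checkpoint = copy.deepcopy(state)  # save state on entry
    for alternate in alternates:
        try:
            alternate(state)
            if acceptance_test(state):
                return state
        except Exception:
            pass  # an exception counts as a failed acceptance test
        # Restore the checkpoint before trying the next alternate.
        state.clear()
        state.update(copy.deepcopy(checkpoint))
    raise RecoveryBlockFailure("all alternates failed")


def primary_sort(state):
    # Hypothetical faulty primary: loses an element while sorting.
    state["items"].pop()
    state["items"].sort()


def backup_sort(state):
    # Hypothetical independently designed alternate.
    state["items"] = sorted(state["items"])


def sorted_and_complete(state):
    # Acceptance test: output is ordered and no element was lost.
    items = state["items"]
    return len(items) == state["expected_len"] and all(
        a <= b for a, b in zip(items, items[1:])
    )
```

Calling `recovery_block(state, [primary_sort, backup_sort], sorted_and_complete)` on a state whose `items` list is unsorted lets the faulty primary run first; its result is rejected, the checkpoint is restored, and the backup then runs on the original data. Copying the entire state is the simplest scheme and also the most expensive, which is the kind of trade-off the abstract says the paper analyzes.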