A CASE FOR 2-LEVEL RECOVERY SCHEMES

Authors
Citation
Nh. Vaidya, A CASE FOR 2-LEVEL RECOVERY SCHEMES, I.E.E.E. transactions on computers, 47(6), 1998, pp. 656-666
Citations number
36
Categorie Soggetti
Computer Science Hardware & Architecture","Engineering, Eletrical & Electronic","Computer Science Hardware & Architecture
ISSN journal
00189340
Volume
47
Issue
6
Year of publication
1998
Pages
656 - 666
Database
ISI
SICI code
0018-9340(1998)47:6<656:ACF2RS>2.0.ZU;2-3
Abstract
Long-running applications are often subject to failures. Failures can result in significant loss of computation. Therefore, it is necessary to use a failure recovery scheme to minimize performance overhead in t he presence of failures. In this paper, we argue that it is often adva ntageous to use ''two-level'' recovery schemes. A two-level recovery s cheme tolerates the more probable failures with low performance overhe ad, while the less probable failures may possibly incur a higher overh ead. By minimizing overhead for the more frequently occurring failure scenarios, the two-level approach can achieve lower performance overhe ad (on average) as compared to existing recovery schemes. The paper de scribes two two-level recovery schemes. Performance analysis using a M arkov chain shows that, in practice, a two-level scheme can perform be tter than its ''one-level'' counterpart. While the conclusions of this paper are intuitive, the work on design of appropriate recovery schem es is lacking. The objective of this paper is to motivate research int o recovery schemes that can provide multiple levels of fault tolerance and achieve better performance than existing recovery schemes. The pa per presents an analytical approach for evaluating performance of two- level schemes and shows that such schemes are hard to optimize analyti cally.