An adaptive algorithm for tolerating value faults and crash failures

Citation
Ys. Ren et al., An adaptive algorithm for tolerating value faults and crash failures, IEEE PARALL, 12(2), 2001, pp. 173-192
Number of citations
27
Subject Categories
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
Journal ISSN
1045-9219
Volume
12
Issue
2
Year of publication
2001
Pages
173 - 192
Database
ISI
SICI code
1045-9219(200102)12:2<173:AAAFTV>2.0.ZU;2-3
Abstract
The AQuA architecture provides adaptive fault tolerance to CORBA applications by replicating objects and providing a high-level method that an application can use to specify its desired level of dependability. This paper presents the algorithms that AQuA uses, when an application's dependability requirements can change at runtime, to tolerate both value faults in applications and crash failures simultaneously. In particular, we provide an active replication communication scheme that maintains data consistency among replicas, detects crash failures, collates the messages generated by replicated objects, and delivers the result of each vote. We also present an adaptive majority voting algorithm that enables the correct ongoing vote while both the number of replicas and the majority size dynamically change. Together, these two algorithms form the basis of the mechanism for tolerating and recovering from value faults and crash failures in AQuA.
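
Illustrative sketch (not part of the record, and not the algorithm from the paper): the abstract gives no pseudocode, so the following Python is only a minimal illustration of the general idea of a majority vote whose threshold follows the current replica count. The class name `AdaptiveVoter`, its methods, and the simple-majority rule `n // 2 + 1` are all assumptions made for this example.

```python
from collections import Counter


class AdaptiveVoter:
    """Hypothetical voter; names and majority rule are assumptions, not AQuA's API."""

    def __init__(self, replicas):
        self.replicas = set(replicas)  # replicas currently believed to be alive
        self.replies = {}              # replica id -> reply for the ongoing vote

    def majority_size(self):
        # Assumed rule: simple majority of the current replica count.
        return len(self.replicas) // 2 + 1

    def add_replica(self, rid):
        self.replicas.add(rid)

    def remove_replica(self, rid):
        # Called when a crash is detected; the crashed replica's reply is dropped.
        self.replicas.discard(rid)
        self.replies.pop(rid, None)

    def submit(self, rid, value):
        # Replies from replicas outside the current membership are ignored.
        if rid in self.replicas:
            self.replies[rid] = value

    def result(self):
        # Return the value backed by a majority of current replicas, else None.
        if not self.replies:
            return None
        value, count = Counter(self.replies.values()).most_common(1)[0]
        return value if count >= self.majority_size() else None


voter = AdaptiveVoter(["r1", "r2", "r3", "r4", "r5"])
voter.submit("r1", 42)
voter.submit("r2", 42)
voter.submit("r3", 7)        # a value fault: r3 disagrees
print(voter.result())        # None: 2 matching replies, majority of 5 is 3

voter.remove_replica("r4")   # crash failures shrink the group to 3
voter.remove_replica("r5")
print(voter.result())        # 42: 2 matching replies, majority of 3 is 2
```

In this sketch the ongoing vote is not restarted when membership changes; the threshold is recomputed each time `result()` is called, which is one simple way to let a vote complete while the replica count and majority size change.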