Consensus-based fault-tolerant total order multicast

Citation
U. Fritzke et al., Consensus-based fault-tolerant total order multicast, IEEE PARALL, 12(2), 2001, pp. 147-156
Citations number
23
Categorie Soggetti
Computer Science & Engineering
Journal title
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
ISSN journal
10459219 → ACNP
Volume
12
Issue
2
Year of publication
2001
Pages
147 - 156
Database
ISI
SICI code
1045-9219(200102)12:2<147:CFTOM>2.0.ZU;2-G
Abstract
While Total Order Broadcast (or Atomic Broadcast) primitives have received a lot of attention, this paper concentrates on Total Order Multicast to Mul tiple Groups in the context of asynchronous distributed systems in which pr ocesses may suffer crash failures. "Multicast to Multiple Groups" means tha t each message is sent to a subset of the process groups composing the syst em, distinct messages possibly having distinct destination groups. "Total O rder" means that all message deliveries must be totally ordered. This paper investigates a consensus-based approach to solve this problem and proposes a corresponding protocol to implement this multicast primitive. This proto col is based on two underlying building blocks, namely, Uniform Reliable Mu lticast and Uniform Consensus. Its design characteristics lie in the two fo llowing properties: The first one is a Minimality property, more precisely, only the sender of a message and processes of its destination groups have to participate in the total order multicast of the message. The second prop erty is a Locality property: No execution of a consensus has to involve pro cesses belonging to distinct groups (i.e., consensus is executed on a "per group" basis). This Locality property is particularly useful when one is in terested in using the Total Order Multicast primitive in large-scale distri buted systems. In addition to a correctness proof, an improvement that redu ces the cost of the protocol is also suggested.