GROUP-TO-GROUP COMMUNICATIONS FOR FAULT-TOLERANCE IN DISTRIBUTED SYSTEMS

Citation
H. Higaki and T. Soneoka, GROUP-TO-GROUP COMMUNICATIONS FOR FAULT-TOLERANCE IN DISTRIBUTED SYSTEMS, IEICE Transactions on Information and Systems, E76D(11), 1993, pp. 1348-1357
Number of citations
NO
Subject Categories
Computer Applications & Cybernetics
ISSN journal
0916-8532
Volume
E76D
Issue
11
Year of publication
1993
Pages
1348 - 1357
Database
ISI
SICI code
0916-8532(1993)E76D:11<1348:GCFFID>2.0.ZU;2-M
Abstract
This paper proposes a group-to-group communications algorithm that extends the range of distributed systems in which active replication fault-tolerance can be achieved to partner model distributed systems, in which all processes communicate with each other on an equal footing. The active replication approach, in which all replicated processes are active, can achieve fault-tolerance with low overhead because checkpoint setting and rollback are not required for recovery from process failure. The algorithm guarantees that each replicated process in a process group has the same execution history and that communications between process groups remain consistent even in the presence of process failure and message loss. The number of control messages that must be transmitted between processes for a communication between process groups is only linear in the number of replicated processes in each process group. Furthermore, the algorithm reduces the overhead of reconfiguring a process group by keeping process failure and recovery information local to each process group.
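The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of active replication, not the authors' protocol. It assumes deterministic replicas and a hypothetical `ProcessGroup.multicast` that delivers each message to every live replica in the same sequence order. The names `Replica`, `ProcessGroup`, `multicast`, and `fail` are illustrative and do not come from the paper. The sketch shows why surviving replicas need no checkpoint or rollback: they already hold the same execution history.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One active replica of a deterministic process (hypothetical model)."""
    name: str
    alive: bool = True
    state: int = 0
    history: list = field(default_factory=list)

    def deliver(self, seq: int, value: int) -> None:
        # Deterministic update: the same messages in the same order
        # always produce the same state and the same history.
        self.history.append((seq, value))
        self.state += value


class ProcessGroup:
    """A group of replicas that all receive every message (assumed model)."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]
        self.next_seq = 0

    def multicast(self, value: int) -> None:
        # Assign one sequence number per message and deliver it to
        # every live replica, so all of them see the same order.
        seq = self.next_seq
        self.next_seq += 1
        for replica in self.replicas:
            if replica.alive:
                replica.deliver(seq, value)

    def fail(self, name: str) -> None:
        # Mark a replica as crashed; it receives nothing further.
        for replica in self.replicas:
            if replica.name == name:
                replica.alive = False

    def survivors(self):
        return [r for r in self.replicas if r.alive]


group = ProcessGroup(["p1", "p2", "p3"])
group.multicast(5)
group.multicast(7)
group.fail("p2")      # one replica crashes mid-run
group.multicast(1)    # the remaining replicas continue without rollback
survivors = group.survivors()
```

In this toy model each message costs one delivery per live replica, which is the trivial single-sender case of a message count that grows linearly with the group size. The paper's group-to-group setting, with message loss and replicated senders, is more involved and is not modeled here.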