Multitolerance in distributed reset

Authors

Kulkarni, SS; Arora, A

Citation

S.S. Kulkarni and A. Arora, Multitolerance in distributed reset, CH J THEOR, (4), 1998, pp. 1-46

Subject categories

Computer Science & Engineering

Journal title

CHICAGO JOURNAL OF THEORETICAL COMPUTER SCIENCE

ISSN journal

1073-0486

Issue

4

Year of publication

1998

Pages

1 - 46

Database

ISI

SICI code

1073-0486(199812):4<1:MIDR>2.0.ZU;2-U

Abstract

A reset of a distributed system is safe if it does not complete "prematurely," i.e., without having reset some process in the system. Safe resets are possible in the presence of certain faults, such as process fail-stops and repairs, but are not always possible in the presence of more general faults, such as arbitrary transients. In this paper, we design a bounded-memory distributed-reset program that possesses two tolerances: (1) in the presence of fail-stops and repairs, it always executes resets safely, and (2) in the presence of a finite number of transient faults, it eventually executes resets safely. Designing this multitolerance in the reset program introduces the novel concern of designing a safety detector that is itself multitolerant. A broad application of our multitolerant safety detector is to make any total program likewise multitolerant.