Jh. Kim and Nh. Vaidya, SINGLE FAULT-TOLERANT DISTRIBUTED SHARED-MEMORY USING COMPETITIVE UPDATE, Microprocessors and microsystems, 21(3), 1997, pp. 183-196
In this paper, we propose a single fault-tolerant distributed shared
memory (DSM) that uses a competitive update protocol. In this update
protocol, multiple copies of each page may be maintained at different
nodes. However, it is also possible for a page to exist at only one
node, as some copies of the page may be invalidated. We propose an
implementation that makes the competitive update protocol recoverable
from a single node failure by guaranteeing that at least two copies of
each page exist. We also present a mechanism that maintains consistency
between shared data and process local state after recovery by updating
shared data and process local state atomically. The paper presents an
evaluation of the recoverable DSM using an implementation. It is shown
that the overhead of making the DSM recoverable, measured in terms of
the number of messages and the amount of data transferred, is small in
many applications. (C) 1997 Elsevier Science B.V.
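
The two-copy guarantee described in the abstract can be illustrated with a
minimal sketch. It is not the authors' implementation. It assumes the common
form of competitive update, in which a node drops its copy of a page after
receiving a fixed number of remote updates without accessing the page
locally. The class name PageDirectory and the parameters limit and
min_copies are hypothetical, chosen only for this illustration.

```python
class PageDirectory:
    """Tracks which nodes hold a valid copy of ONE page (illustrative only)."""

    def __init__(self, nodes, limit=2, min_copies=2):
        # limit: assumed threshold of remote updates tolerated without a
        #        local access before a node invalidates its copy.
        # min_copies: copies that must always survive; 2 tolerates the
        #             failure of a single node, 1 gives no protection.
        self.limit = limit
        self.min_copies = min_copies
        self.holders = set(nodes)
        self.remote_updates = {n: 0 for n in nodes}

    def local_access(self, node):
        # A local access (re)acquires a copy and resets the node's counter.
        self.holders.add(node)
        self.remote_updates[node] = 0

    def write(self, writer):
        # The writer accesses the page, then pushes the update to the others.
        self.local_access(writer)
        for node in sorted(self.holders - {writer}):
            self.remote_updates[node] += 1
            over_limit = self.remote_updates[node] >= self.limit
            # Invalidate only if enough copies remain afterwards.
            if over_limit and len(self.holders) > self.min_copies:
                self.holders.discard(node)
                self.remote_updates[node] = 0


# Without the guarantee, repeated writes by A leave a single copy.
plain = PageDirectory(["A", "B", "C"], limit=2, min_copies=1)
plain.write("A")
plain.write("A")

# With min_copies=2, one extra copy is always retained.
recoverable = PageDirectory(["A", "B", "C"], limit=2, min_copies=2)
for _ in range(5):
    recoverable.write("A")
```

In this sketch the retained copy keeps receiving updates that plain
competitive update would have avoided, which is one source of the message
and data overhead that the abstract reports measuring.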