GENERATING A FAULT-TOLERANT GLOBAL CLOCK USING HIGH-SPEED CONTROL SIGNALS FOR THE METANET ARCHITECTURE

Authors
Y. Ofek
Citation
Y. Ofek, GENERATING A FAULT-TOLERANT GLOBAL CLOCK USING HIGH-SPEED CONTROL SIGNALS FOR THE METANET ARCHITECTURE, IEEE Transactions on Communications, 42(5), 1994, pp. 2179-2188
Number of citations
18
Subject Categories
Telecommunications; Engineering, Electrical & Electronic
Journal ISSN
0090-6778
Volume
42
Issue
5
Year of publication
1994
Pages
2179 - 2188
Database
ISI
SICI code
0090-6778(1994)42:5<2179:GAFGCU>2.0.ZU;2-W
Abstract
This work describes a new technique, based on exchanging control signals between neighboring nodes, for constructing a stable and fault-tolerant global clock in a distributed system with an arbitrary topology. It is shown that it is possible to construct a global clock reference with time step that is much smaller than the propagation delay over the network's links. The synchronization algorithm ensures that the global clock "tick" has a stable periodicity, and therefore, it is possible to tolerate failures of links and clocks that operate faster and/or slower than nominally specified, as well as hard failures. The approach taken in this work is to generate a global clock from the ensemble of the local transmission clocks and not to directly synchronize these high-speed clocks. The steady-state algorithm, which generates the global clock, is executed in hardware by the network interface of each node. At the network interface, it is possible to measure accurately the propagation delay between neighboring nodes with a small error or uncertainty and thereby to achieve global synchronization that is proportional to these error measurements. It is shown that the local clock drift (or rate uncertainty) has only a secondary effect on the maximum global clock rate. The synchronization algorithm can tolerate any physical failure. It will continue to operate correctly on any connected segment of the network, i.e., it can tolerate any number of link and node failures, as long as the network remains connected. Furthermore, the algorithm can tolerate failures of the following types: i) fast and slow clocks can be detected and isolated from the algorithm, ii) changes in the value of link delays can be masked, and iii) malicious changes of the global clock values can be detected and masked.