METHODS FOR OBSERVING GLOBAL PROPERTIES IN DISTRIBUTED SYSTEMS

Authors
Citation
V.K. Garg, METHODS FOR OBSERVING GLOBAL PROPERTIES IN DISTRIBUTED SYSTEMS, IEEE Concurrency, 5(4), 1997, pp. 69
Citations number
19
Subject Categories
Computer Sciences; Computer Science Theory & Methods
Journal title
Volume
5
Issue
4
Year of publication
1997
Database
ISI
SICI code
Abstract
A fundamental problem in developing distributed software is that no process has access to the global state. Thus, computing a global predicate or function, a need that occurs frequently in many distributed systems, typically requires significant programming. Being able to observe a distributed computation is useful for many fundamental problems in distributed software, such as debugging, testing, and fault-tolerance. After a program is debugged and tested, it must be monitored for fault-tolerance, again requiring something that will observe the global state. Finally, the ability to observe global predicates generalizes algorithms for many previous problems such as detecting program termination, token loss, and deadlock. Research on how to detect global predicates has yielded three sets of algorithms. In the global snapshot algorithm, global snapshots of the computation are repeatedly computed until the desired predicate becomes true. However, this approach works only for stable predicates like deadlock and termination, which do not turn false once they become true. In the second set of algorithms, a lattice of global states is constructed. Unlike the global snapshot approach, this approach lets users detect unstable predicates. However, it can mean exploring a prohibitive number of global states. This article surveys algorithms that use a third approach, which exploits the structure of the predicate, but does not build a lattice. Instead, they examine the computation itself to deduce if a predicate became true. These algorithms are computationally efficient and can be used to detect even unstable predicates.
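To illustrate the third approach, here is a minimal sketch of how a conjunctive predicate (one local condition per process, all required to hold at once) could be detected without building the lattice of global states. The abstract does not name specific algorithms, so this is an assumption about what such a method can look like, not the article's own algorithm. It assumes every process timestamps its local states with vector clocks and reports, in local order, the clocks of the states where its local condition holds. The names `happened_before` and `detect_conjunctive` are hypothetical.

```python
from collections import deque


def happened_before(u, v):
    """True if vector clock u is componentwise <= v and differs from v,
    i.e. the state stamped u causally precedes the state stamped v."""
    return all(a <= b for a, b in zip(u, v)) and u != v


def detect_conjunctive(queues):
    """Look for a consistent global state in which every local condition holds.

    queues[i] holds, in local order, the vector clocks of the states of
    process i where its local condition is true (assumed input format).
    Returns one clock per process, pairwise concurrent, or None.
    """
    queues = [deque(q) for q in queues]
    if any(not q for q in queues):
        return None
    n = len(queues)
    while True:
        advanced = False
        for i in range(n):
            for j in range(n):
                if i != j and happened_before(queues[i][0], queues[j][0]):
                    # The candidate of process i precedes the candidate of
                    # process j, and therefore every later state of j too,
                    # so it can never be part of a solution: discard it.
                    queues[i].popleft()
                    if not queues[i]:
                        return None
                    advanced = True
                    break
            if advanced:
                break
        if not advanced:
            # All candidates are pairwise concurrent: a consistent cut.
            return [q[0] for q in queues]


# Process 0 satisfies its condition at clocks (1,0) and (3,0);
# process 1 satisfies its condition at clock (1,2), after a message from process 0.
cut = detect_conjunctive([[(1, 0), (3, 0)], [(1, 2)]])
none_found = detect_conjunctive([[(1, 0)], [(1, 2)]])
```

In this sketch each candidate state is discarded at most once, so the search is bounded by the number of reported states instead of the number of global states, and it works for unstable conditions because it inspects the recorded computation instead of a single snapshot.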