ANALYZING AND DEBUGGING DISTRIBUTED EXECUTIONS

Authors
M. Raynal
Citation
M. Raynal, ANALYZING AND DEBUGGING DISTRIBUTED EXECUTIONS, Kuwait journal of science & engineering, 1996, pp. 135-149
Number of citations
22
Subject Categories
Multidisciplinary Sciences
Year of publication
1996
Supplement
1
Pages
135 - 149
Database
ISI
SICI code
Abstract
Distributed programs are much more difficult to design, understand and implement than sequential or parallel ones. This is mainly due to the uncertainty created by the asynchrony inherent to distributed machines. Therefore, appropriate concepts and tools have to be devised to help the programmer of distributed applications in his task. This paper is motivated by the practical problem called distributed debugging. It presents concepts and tools that help the programmer to analyze distributed executions. Two basic problems are addressed: replay of a distributed execution (how to reproduce an equivalent execution despite asynchrony) and the detection of a stable or unstable property of a distributed execution. The concepts and tools presented are fundamental when designing an environment for distributed program development. This paper is essentially a survey presenting the state-of-the-art in replay mechanisms and detection of unstable properties on global states of distributed executions.