Interoperable run-time tools for distributed systems - A case study

Citation
R. Wismuller and T. Ludwig, Interoperable run-time tools for distributed systems - A case study, J SUPERCOMP, 17(3), 2000, pp. 277-289
Number of citations
18
Subject categories
Computer Science & Engineering
Journal title
JOURNAL OF SUPERCOMPUTING
ISSN journal
0920-8542
Volume
17
Issue
3
Year of publication
2000
Pages
277 - 289
Database
ISI
SICI code
0920-8542(200011)17:3<277:IRTFDS>2.0.ZU;2-7
Abstract
Tools that observe and manipulate the run-time behavior of parallel and distributed systems are essential for developing and maintaining these systems. Sometimes users would even need to use several tools at the same time in order to have a higher functionality at their disposal. Today, tools developed independently by different vendors are, however, not able to interoperate. Interoperability not only allows concurrent use of tools, but also can lead to an added value for the user. A debugger interoperating with a checkpointing system, for example, can provide a debugging environment where the debugged program can be reset to any previous state, thus speeding up cyclic debugging for long-running programs. Using this example scenario, we derive requirements that should be met by the tools' software infrastructure in order to enable interoperability. A review of existing infrastructures shows that these requirements are only partially met today. In an ongoing research effort, support for all of the requirements is built into the OMIS-compliant on-line monitoring system OCM.