Staggered consistent checkpointing

Authors

Vaidya, NH

Citation

N. H. Vaidya, Staggered consistent checkpointing, IEEE Transactions on Parallel and Distributed Systems, 10(7), 1999, pp. 694-702

Citations number

Subject Categories

Computer Science & Engineering

Journal title

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS

ISSN journal

1045-9219

Volume

10

Issue

7

Year of publication

1999

Pages

694 - 702

Database

ISI

SICI code

1045-9219(199907)10:7<694:SCC>2.0.ZU;2-9

Abstract

A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce checkpoint overhead. This paper presents a simple approach to arbitrarily stagger the checkpoints. Our approach requires that the processes take consistent logical checkpoints, as compared to consistent physical checkpoints enforced by existing algorithms. Experimental results on nCube-2 are presented.