Threshold-based reconfiguration strategies for gracefully degradable parallel computations

Citation
M. Colajanni et al., Threshold-based reconfiguration strategies for gracefully degradable parallel computations, J PAR DISTR, 55(1), 1998, pp. 138-151
Number of citations
18
Subject Categories
Computer Science & Engineering
Journal title
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
ISSN journal
0743-7315
Volume
55
Issue
1
Year of publication
1998
Pages
138 - 151
Database
ISI
SICI code
0743-7315(19981125)55:1<138:TRSFGD>2.0.ZU;2-A
Abstract
The occurrence of faults in multicomputers with hundreds or thousands of nodes is a likely event that can be dealt with hardware or software fault-tolerant approaches. This paper presents a unifying model that describes software reconfiguration strategies for parallel applications with regular computational pattern. We show that most existing strategies can be obtained as instances of the proposed threshold-based reconfiguration meta-algorithm. Moreover, this approach is useful to discover several yet unexplored strategies among which we consider the class of the adaptive threshold-based strategies. The performance optimization analysis demonstrates that these strategies, applied to data-parallel regular computations, give optimal results for worst fault patterns. A wide spectrum of simulations, where the system parameters have been settled to those of actual multicomputers, confirms that adaptive threshold-based strategies yield the most stable performance for a variety of workloads, independently of the number and pattern of faults. (C) 1998 Academic Press.