An analytical model for a parallel fault-tolerant computing system

Authors

Persone, VD; Grassi, V

Citation

Vd. Persone and V. Grassi, An analytical model for a parallel fault-tolerant computing system, PERF EVAL, 38(3-4), 1999, pp. 201-218

Citations number

Subject Categories

Computer Science & Engineering

Journal title

PERFORMANCE EVALUATION

ISSN journal

0166-5316

Volume

38

Issue

3-4

Year of publication

1999

Pages

201 - 218

Database

ISI

SICI code

0166-5316(199912)38:3-4<201:AAMFAP>2.0.ZU;2-K

Abstract

We present an analytical model of a parallel computing system. Since the probability of fault occurrence is non-negligible, the model takes fault-tolerance issues into consideration by combining results obtained from a performance model with a fault/repair model. To this purpose, the system performance must be evaluated under several different configurations, caused by the occurrence of faults and repairs. This requires efficient solution techniques for the performance model. The model we adopt is based on an extended queueing network. The queueing network includes a fork/join subnetwork with finite capacity, and three different blocking models to manage the saturation condition: Blocking Before Service (BBS), Repetitive Service, or Blocking After Service. We prove that the underlying Markov process has a particular structure suitable for efficient solution. To show a possible use of such a model, we present numerical results for a particular maintenance policy, looking for the optimal trade-off between the frequency of service interruption due to repair operations and the need to avoid excessive performance degradation. (C) 1999 Elsevier Science B.V. All rights reserved.