FAULT-TOLERANT DISTRIBUTED SUBCUBE MANAGEMENT SCHEME FOR HYPERCUBE MULTICOMPUTER SYSTEMS

Authors
Citation
Yl. Chen and Jc. Liu, FAULT-TOLERANT DISTRIBUTED SUBCUBE MANAGEMENT SCHEME FOR HYPERCUBE MULTICOMPUTER SYSTEMS, IEEE Transactions on Parallel and Distributed Systems, 6(7), 1995, pp. 766-772
Number of citations
18
Subject Categories
System Science; Engineering, Electrical & Electronic; Computer Science Theory & Methods
Journal ISSN
1045-9219
Volume
6
Issue
7
Year of publication
1995
Pages
766 - 772
Database
ISI
SICI code
1045-9219(1995)6:7<766:FDSMSF>2.0.ZU;2-S
Abstract
This paper proposes a fault tolerant distributed subcube management scheme for hypercube multicomputer systems. Gracefully degradable subcube management is supported by a data structure, called the distributed subcube table (DST), and a fault tolerant broadcast protocol, called the reliably synchronized broadcast (RSB). In an n-dimensional hypercube, DST is the collection of $2^n$ local subcube tables (LSTs), $\mathrm{DST} = \{\mathrm{LST}(0), \mathrm{LST}(1), \ldots, \mathrm{LST}(2^n - 1)\}$, where LST(x) is a bit mapped table assigned to $N_x$, a fault free node whose address is x. LST(x), for all x, is n + 1 bits long, and it records the status (free/busy) of certain subcubes adjacent to $N_x$. The RSB diagnoses and avoids faults during interprocessor communication to prevent faulty nodes from being allocated for job execution. In addition to possessing a fault-tolerant design, our scheme can also achieve comparable or better performance than existing centralized schemes, as verified by extensive simulation.
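
As a rough illustration of the storage layout described in the abstract, the following Python sketch models a DST as one (n + 1)-bit table per fault-free node. It is not the authors' implementation. The class name, method names, and the idea of addressing bits by an opaque `slot` index are assumptions made here for illustration; the abstract does not say which subcube each bit refers to, nor how the RSB protocol updates the tables, so neither is modeled.

```python
class DistributedSubcubeTable:
    """Toy model of a DST: one (n + 1)-bit local table per fault-free node.

    Assumption: each bit is treated as an opaque slot (0 = free, 1 = busy);
    the mapping from slots to specific subcubes is not given in the abstract.
    """

    def __init__(self, n, faulty=()):
        self.n = n
        faulty = set(faulty)
        # One LST per fault-free node address x in [0, 2^n).
        self.lst = {x: 0 for x in range(2 ** n) if x not in faulty}

    def neighbors(self, x):
        # Hypercube neighbors differ from x in exactly one address bit.
        return [x ^ (1 << d) for d in range(self.n)]

    def set_busy(self, x, slot, busy=True):
        if not 0 <= slot <= self.n:
            raise IndexError("an LST has only n + 1 bits")
        if busy:
            self.lst[x] |= 1 << slot
        else:
            self.lst[x] &= ~(1 << slot)

    def is_busy(self, x, slot):
        return bool((self.lst[x] >> slot) & 1)

    def mark_faulty(self, x):
        # Assumed degradation model: a faulty node simply loses its table.
        self.lst.pop(x, None)
```

For example, `DistributedSubcubeTable(3, faulty=[5])` holds seven tables of four bits each, and `neighbors(0)` returns the addresses 1, 2, and 4. Keeping each table as a single integer reflects the bit-mapped description in the abstract; everything beyond that is a placeholder.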