TASK ALLOCATION AND REALLOCATION FOR FAULT-TOLERANCE IN MULTICOMPUTER SYSTEMS

Authors

CHEN CIH; CHERKASSKY V

Citation

Cih. Chen and V. Cherkassky, TASK ALLOCATION AND REALLOCATION FOR FAULT-TOLERANCE IN MULTICOMPUTER SYSTEMS, IEEE transactions on aerospace and electronic systems, 30(4), 1994, pp. 1094-1104

Citations number

Subject Categories

Telecommunications; Engineering, Electrical & Electronic; Aerospace Engineering & Technology

Journal title

IEEE transactions on aerospace and electronic systems

ISSN journal

0018-9251

Volume

30

Issue

4

Year of publication

1994

Pages

1094 - 1104

Database

ISI

SICI code

0018-9251(1994)30:4<1094:TAARFF>2.0.ZU;2-N

Abstract

The goal of task allocation in a set of interconnected processors (computers) is to maximize the efficient use of resources and thus reduce the job turnaround time. We propose a simple yet effective method to allocate tasks in multicomputer systems that minimizes the interprocessor communication cost subject to resource limitations defined by the system and the designer. These limitations can be viewed as resulting from load balancing, since the execution time of each task, the number of available processors, the processor speed, and the memory capacity are known to the system or designer. As the number of processors increases, the probability of a failure existing somewhere in the system at any time also increases. Very few established task allocation models have considered reliability. For multicomputer systems, we define system reliability as the probability that the system can run the tasks successfully. After the (nonredundant) task scheduling strategy is defined, tasks are reallocated to processors statically and redundantly. This is a form of time redundancy: if some processors fail during execution, all tasks can still be completed on the remaining processors (but in a longer time). Because tasks are statically preallocated, this method is simpler and thus more practical than the well-known dynamic reconfiguration and rollback recovery techniques in multicomputer systems. We demonstrate the effectiveness of task allocation and reallocation for hardware fault tolerance by applying the methods to several examples and to practical communications network multiprocessor systems.
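
The two steps named in the abstract (a nonredundant allocation that minimizes interprocessor communication cost under a resource limit, followed by a static redundant reallocation) can be illustrated with a toy sketch. This is not the authors' algorithm, which the abstract does not specify: the exhaustive search, the least-loaded backup rule, the function names `allocate` and `assign_backups`, and all task data are assumptions made only for illustration.

```python
from itertools import product


def comm_cost(assign, comm):
    """Sum the cost of every communicating task pair split across processors."""
    return sum(c for (a, b), c in comm.items() if assign[a] != assign[b])


def allocate(exec_time, comm, n_procs, load_cap):
    """Brute-force search (an assumption, not the paper's method): return the
    assignment with the lowest communication cost whose per-processor load
    stays within load_cap. Only practical for very small task sets."""
    tasks = sorted(exec_time)
    best, best_cost = None, float("inf")
    for placement in product(range(n_procs), repeat=len(tasks)):
        assign = dict(zip(tasks, placement))
        loads = [0] * n_procs
        for task, proc in assign.items():
            loads[proc] += exec_time[task]
        if max(loads) > load_cap:
            continue  # violates the resource limitation
        cost = comm_cost(assign, comm)
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost


def assign_backups(assign, exec_time, n_procs):
    """Statically give each task a backup processor that differs from its
    primary, picking the currently least-loaded alternative (hypothetical rule)."""
    loads = [0] * n_procs
    for task, proc in assign.items():
        loads[proc] += exec_time[task]
    backup = {}
    for task in sorted(assign):
        candidates = [p for p in range(n_procs) if p != assign[task]]
        chosen = min(candidates, key=lambda p: loads[p])
        backup[task] = chosen
        loads[chosen] += exec_time[task]
    return backup


# Made-up example data: execution times and pairwise communication costs.
exec_time = {"t1": 2, "t2": 3, "t3": 2, "t4": 3}
comm = {("t1", "t2"): 5, ("t2", "t3"): 1, ("t3", "t4"): 4, ("t1", "t4"): 1}

assign, cost = allocate(exec_time, comm, n_procs=2, load_cap=5)
backup = assign_backups(assign, exec_time, n_procs=2)
print(assign, cost)
print(backup)
```

In this example the heavily communicating pairs (t1, t2) and (t3, t4) end up on the same processor. Each task's backup sits on the other processor, so a single processor failure leaves every task runnable, at the price of a longer completion time.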