HARDWARE SUPPORT FOR ERROR-DETECTION IN MULTIPROCESSOR SYSTEMS - A CASE-STUDY

Citation
W. Hohl et al., HARDWARE SUPPORT FOR ERROR-DETECTION IN MULTIPROCESSOR SYSTEMS - A CASE-STUDY, Microprocessors and Microsystems, 17(4), 1993, pp. 201-206
Number of citations
16
Subject Categories
Computer Sciences; Engineering, Electrical & Electronic; Computer Applications & Cybernetics
Journal ISSN
0141-9331
Volume
17
Issue
4
Year of publication
1993
Pages
201 - 206
Database
ISI
SICI code
0141-9331(1993)17:4<201:HSFEIM>2.0.ZU;2-6
Abstract
A comparison of the most important methods for error detection in multiprocessor systems is presented based upon the experiences gained in the development of the fault-tolerant multiprocessor system MEMSY. A detailed comparison between watchdog processors and master-checker type duplication based fault tolerance is given, from the point of view of fault coverage, hardware and time overhead. It is shown that a simple multiplication in itself is insufficient to assure proper error detection features, especially if a low error latency time is required. Design guidelines are presented for the effective use of the duplication, based on the master-checker mode. Additionally, a new general purpose watchdog processor architecture is proposed, which monitors the behaviour of the main processor by checking the control flow of processes using an extended signature integrity checking (ESIC) method. The watchdog processor is independent of the architecture of the main processor because it is linked to the main processor by a memory interface. The watchdog processor is convenient for multiprocessor systems based on standard components and a RISC/CISC processor with large cache as node processor.