COMPLETE AND PARTIAL FAULT-TOLERANCE OF FEEDFORWARD NEURAL NETS

Authors
Citation
D.S. Phatak and I. Koren, COMPLETE AND PARTIAL FAULT-TOLERANCE OF FEEDFORWARD NEURAL NETS, IEEE Transactions on Neural Networks, 6(2), 1995, pp. 446-456
Number of citations
29
Subject Categories
Computer Application, Chemistry & Engineering; Engineering, Electrical & Electronic; Computer Science Artificial Intelligence; Computer Science Hardware & Architecture; Computer Science Theory & Methods
ISSN journal
1045-9227
Volume
6
Issue
2
Year of publication
1995
Pages
446 - 456
Database
ISI
SICI code
1045-9227(1995)6:2<446:CAPFOF>2.0.ZU;2-D
Abstract
A method is proposed to estimate the fault tolerance of feedforward artificial neural nets (ANN's) and synthesize robust nets. The fault model abstracts a variety of failure modes of hardware implementations to permanent stuck-at type faults of single components. A procedure is developed to build fault tolerant ANN's by replicating the hidden units. It exploits the intrinsic weighted summation operation performed by the processing units to overcome faults. It is simple, robust, and applicable to any feedforward net. Based on this procedure, metrics are devised to quantify the fault tolerance as a function of redundancy. Furthermore, a lower bound on the redundancy required to tolerate all possible single faults is analytically derived. This bound demonstrates that less than triple modular redundancy (TMR) cannot provide complete fault tolerance for all possible single faults. This general result establishes a necessary condition that holds for all feedforward nets, regardless of the network topology or the task it is trained on. Analytical as well as extensive simulation results indicate that the actual redundancy needed to synthesize a completely fault tolerant net is specific to the problem at hand and is usually much higher than that dictated by the general lower bound. The data imply that the conventional TMR scheme of triplication and majority vote is the best way to achieve complete fault tolerance in most ANN's. Although the redundancy needed for complete fault tolerance is substantial, the results do show that ANN's exhibit good partial fault tolerance to begin with (i.e., without any extra redundancy) and degrade gracefully. The first replication is seen to yield maximum enhancement in partial fault tolerance compared with later successive replications. For large nets, exhaustive testing of all possible single faults is prohibitive. Hence the strategy of randomly testing a small fraction of the total number of links is adopted. It yields partial fault tolerance estimates that are very close to those obtained by exhaustive testing. Moreover, when the fraction of links tested is held fixed, the accuracy of the estimate generated by random testing is seen to improve as the net size grows.
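
The replication and testing ideas in the abstract can be illustrated with a small sketch. It is not the authors' procedure or code; it only shows the general mechanism under stated assumptions. The assumptions are: a single hidden layer with one sigmoid output, hand-set XOR weights (hypothetical), outgoing weights divided by the number of copies so the fault-free output is unchanged, faults injected only on hidden-to-output links, and stuck values of 0 and plus or minus the largest original weight. The function names (`replicate`, `partial_fault_tolerance`, `sampled_fault_tolerance`) are invented for this illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(net, x):
    """One hidden layer of sigmoid units feeding a single sigmoid output."""
    w_hid, b_hid, w_out, b_out = net
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hid, b_hid)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def replicate(net, copies):
    """Copy every hidden unit `copies` times.

    Assumption: outgoing weights are divided by `copies`, so the weighted
    sum at the output unit, and hence the fault-free output, is unchanged.
    """
    w_hid, b_hid, w_out, b_out = net
    return ([list(row) for _ in range(copies) for row in w_hid],
            [b for _ in range(copies) for b in b_hid],
            [w / copies for _ in range(copies) for w in w_out],
            b_out)


def partial_fault_tolerance(net, data, stuck_values, links=None):
    """Fraction of (single fault, input) pairs still classified correctly.

    Each fault forces one hidden-to-output weight to a stuck value.
    `links` restricts testing to a subset of link indices (default: all).
    """
    w_hid, b_hid, w_out, b_out = net
    if links is None:
        links = range(len(w_out))
    correct = total = 0
    for j in links:
        for value in stuck_values:
            faulty_out = list(w_out)
            faulty_out[j] = value          # permanent stuck-at fault on link j
            faulty = (w_hid, b_hid, faulty_out, b_out)
            for x, target in data:
                correct += (forward(faulty, x) > 0.5) == bool(target)
                total += 1
    return correct / total


def sampled_fault_tolerance(net, data, stuck_values, fraction, seed=0):
    """Estimate the same metric from a random fraction of the links."""
    n_links = len(net[2])
    k = max(1, round(fraction * n_links))
    links = random.Random(seed).sample(range(n_links), k)
    return partial_fault_tolerance(net, data, stuck_values, links)


# Hand-set XOR net (hypothetical weights): hidden units approximate OR and AND.
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
base_net = ([[10.0, 10.0], [10.0, 10.0]], [-5.0, -15.0], [10.0, -10.0], -5.0)
STUCK = (0.0, 10.0, -10.0)   # stuck-at-zero and stuck at +/- the largest weight

tol_1 = partial_fault_tolerance(base_net, XOR_DATA, STUCK)
tol_3 = partial_fault_tolerance(replicate(base_net, 3), XOR_DATA, STUCK)
est_3 = sampled_fault_tolerance(replicate(base_net, 3), XOR_DATA, STUCK, 0.5)
print(tol_1, tol_3, est_3)
```

In this toy setup the tripled net scores higher than the original but still below 1, which is at least consistent with the abstract's remark that complete fault tolerance needs substantial redundancy. The stuck values and the restriction to output-layer links are simplifications; a fuller experiment would also fault input-to-hidden links, biases, and unit outputs.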