An efficient algorithm for locating soft and hard failures in WDM networks

Authors
Citation
C. Mas and P. Thiran, An efficient algorithm for locating soft and hard failures in WDM networks, IEEE J SEL, 18(10), 2000, pp. 1900-1911
Number of citations
16
Subject categories
Information Technology & Communication Systems
Journal title
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS
ISSN journal
07338716
Volume
18
Issue
10
Year of publication
2000
Pages
1900 - 1911
Database
ISI
SICI code
0733-8716(200010)18:10<1900:AEAFLS>2.0.ZU;2-A
Abstract
Fault identification and location in optical networks is hampered by a multitude of factors: the redundancy and the lack of coordination (internetworking) of the managements at the different layers (WDM, SDH/SONET, ATM, IP); the large number of alarms a single failure can trigger; the difficulty in detecting some failures; and the resulting need to cope with missing or false alarms. Moreover, the problem of multiple fault location is NP-complete, so that the processing time may become an issue for large meshed optical networks.

We propose an algorithm for locating multiple failures at the physical layer of a WDM network. They can be either hard failures, that is, unexpected events that suddenly interrupt the established channels; or soft failures, that is, events that progressively degrade the quality of transmission; or both. Hard failures are detected at the WDM layer. Soft failures can sometimes be detected at the optical layer if proper testing equipment is deployed, but often require performance monitoring at a higher layer (SDH, ATM, or IP). Both types of failures, and both types of error monitoring, are incorporated in our algorithm, which is based on a classification and abstraction of the components of the optical layer and of the upper layer. Our algorithm does not rely on timestamps nor on failure probabilities, which are difficult to estimate and to use in practice. Moreover, our algorithm also handles missing and false alarms. The nonpolynomial computational complexity of the problem is pushed ahead into a precomputational phase, which is done off-line, when the optical channels are set up or cleared down. This results in fast on-line location of the failing components upon reception of the ringing alarms.
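
The split between an expensive off-line phase and a fast on-line lookup can be illustrated with a small Python sketch. This is not the authors' algorithm: the propagation model (a failed component makes every monitor at or downstream of it on the same channel ring), the bound `max_faults`, the Hamming-distance `tolerance` used for missing or false alarms, and all component names are assumptions made only for illustration.

```python
from itertools import combinations


def alarm_domain(component, channels, monitors):
    """Monitors assumed to ring if `component` fails (toy propagation model)."""
    ringing = set()
    for path in channels:
        if component in path:
            idx = path.index(component)
            # The failed element and everything downstream on this channel is affected.
            ringing.update(m for m in path[idx:] if m in monitors)
    return frozenset(ringing)


def precompute(channels, monitors, max_faults=2):
    """Off-line phase: map each reachable alarm pattern to its minimal fault sets.

    The number of combinations grows quickly with max_faults; that cost is paid
    here, once per channel set-up or tear-down, rather than at alarm time.
    """
    components = sorted({c for path in channels for c in path})
    domains = {c: alarm_domain(c, channels, monitors) for c in components}
    table = {}
    for k in range(1, max_faults + 1):
        for combo in combinations(components, k):
            pattern = frozenset().union(*(domains[c] for c in combo))
            candidates = table.setdefault(pattern, [])
            # Keep only minimal explanations: skip supersets of a smaller candidate.
            if not any(set(prev) <= set(combo) for prev in candidates):
                candidates.append(combo)
    return table


def locate(table, alarms, tolerance=0):
    """On-line phase: candidate fault sets for the received alarms.

    With tolerance=0 this is a single dictionary lookup. A positive tolerance
    accepts patterns that differ by that many missing or false alarms.
    """
    alarms = frozenset(alarms)
    if tolerance == 0:
        return list(table.get(alarms, []))
    matches = []
    for pattern, candidates in table.items():
        if len(pattern ^ alarms) <= tolerance:
            matches.extend(candidates)
    return matches


# Hypothetical two-channel network sharing one fiber and one amplifier.
channels = [
    ["tx1", "fiberA", "amp1", "rx1"],
    ["tx2", "fiberA", "amp1", "fiberB", "rx2"],
]
monitors = {"amp1", "rx1", "rx2"}  # components able to raise alarms
table = precompute(channels, monitors)
print(locate(table, {"amp1", "rx1", "rx2"}))
print(locate(table, {"amp1"}, tolerance=1))
```

In this toy model the on-line step never enumerates fault combinations; it only consults the table built when the channels were established, which mirrors the division of labour described in the abstract.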