Event storm detection and identification in communication systems [An article from: Reliability Engineering and System Safety]
Book Details
Author(s)M. Albaghdadi, B. Briley, M. Evens
PublisherElsevier
ISBN / ASINB000RR9QME
ISBN-13978B000RR9QM5
AvailabilityAvailable for download now
MarketplaceUnited States 🇺🇸
Description
This digital document is a journal article from Reliability Engineering and System Safety, published by Elsevier in 2006. The article is delivered in HTML format and is available in your Amazon.com Media Library immediately after purchase. You can view it with any web browser.
Description:
Event storms are the manifestation of an important class of abnormal behaviors in communication systems. They occur when a large number of nodes throughout the system generate a set of events within a small period of time. It is essential for network management systems to detect every event storm and identify its cause, in order to prevent and repair potential system faults. This paper presents a set of techniques for the effective detection and identification of event storms in communication systems. First, we introduce a new algorithm to synchronize events to a single node in the system. Second, the system's event log is modeled as a normally distributed random process. This is achieved by using data analysis techniques to explore and then model the statistical behavior of the event log. Third, event storm detection is proposed using a simple test statistic combined with an exponential smoothing technique to overcome the non-stationary behavior of event logs. Fourth, the system is divided into non-overlapping regions to locate the main contributing regions of a storm. We show that this technique provides us with a method for event storm identification. Finally, experimental results from a commercially deployed multimedia communication system that uses these techniques demonstrate their effectiveness.
Description:
Event storms are the manifestation of an important class of abnormal behaviors in communication systems. They occur when a large number of nodes throughout the system generate a set of events within a small period of time. It is essential for network management systems to detect every event storm and identify its cause, in order to prevent and repair potential system faults. This paper presents a set of techniques for the effective detection and identification of event storms in communication systems. First, we introduce a new algorithm to synchronize events to a single node in the system. Second, the system's event log is modeled as a normally distributed random process. This is achieved by using data analysis techniques to explore and then model the statistical behavior of the event log. Third, event storm detection is proposed using a simple test statistic combined with an exponential smoothing technique to overcome the non-stationary behavior of event logs. Fourth, the system is divided into non-overlapping regions to locate the main contributing regions of a storm. We show that this technique provides us with a method for event storm identification. Finally, experimental results from a commercially deployed multimedia communication system that uses these techniques demonstrate their effectiveness.
