Experimental evaluation of failure-detection schemes in real-time communication networks

Bibliographic Details
Published in: Proceedings of IEEE 27th International Symposium on Fault Tolerant Computing, pp. 122-131
Main Authors: Seungjae Han; K.G. Shin
Format: Conference Proceeding
Language: English
Published: IEEE, 1997

Summary: An effective failure-detection scheme is essential for reliable communication services. Most computer networks rely on behavior-based detection schemes: each node uses heartbeats to detect the failure of its neighbor nodes, and the transport protocol (such as TCP) achieves reliable communication through acknowledgment and retransmission. In this paper, we experimentally evaluate the effectiveness of such behavior-based detection schemes in real-time communication. Specifically, we measure and analyze the coverage and latency of two failure-detection schemes (neighbor detection and end-to-end detection) through fault-injection experiments. The experimental results show that a significant portion of failures can be detected very quickly by the neighbor detection scheme, while the end-to-end detection scheme uncovers the remaining failures with larger detection latencies.
ISBN: 9780818678318, 0818678313
ISSN: 0731-3071, 2375-124X
DOI: 10.1109/FTCS.1997.614085