Transparent three-phase Byzantine fault tolerance for parallel and distributed simulations


Bibliographic Details
Published in: Simulation modelling practice and theory, Vol. 60, pp. 90-107
Main Authors: Li, Zengxiang; Cai, Wentong; Turner, Stephen John; Qin, Zheng; Goh, Rick Siow Mong
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.01.2016
ISSN: 1569-190X, 1878-1462
DOI: 10.1016/j.simpat.2015.09.012

Summary:
• Propose a three-phase Byzantine Fault Tolerance (BFT) mechanism.
• Integrate replication, checkpointing and message logging techniques.
• Develop the BFT mechanism in a transparent manner.
• Remove epidemic effects of Byzantine failures.
• Evaluate the BFT mechanism using a real-world simulation model.

A parallel and distributed simulation (federation) is composed of a number of simulation components (federates). Because the federates may be developed by different participants and executed on different platforms, they are subject to Byzantine failures. Moreover, a failure may propagate through the federation, resulting in an epidemic effect. In this article, a three-phase (i.e., detection, location, and recovery) Byzantine Fault Tolerance (BFT) mechanism is proposed, based on a transparent middleware approach. Replication, checkpointing and message logging techniques are integrated into the mechanism to enhance simulation performance and reduce fault tolerance cost. In addition, mechanisms are provided to remove the epidemic effects of Byzantine failures. The experiments verify the correctness of the three-phase BFT mechanism and illustrate its high efficiency and good scalability. For some simulation executions, the BFT mechanism may even achieve performance enhancement and Byzantine fault tolerance simultaneously.
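
The summary names three phases (detection, location, recovery) and three techniques (replication, checkpointing, message logging) but does not describe how they are implemented. The following is a minimal generic sketch of how such phases could fit together, not the authors' mechanism. Everything in it is an assumption made for illustration: the `Replica` class, the running-sum "simulation state", majority voting as the location rule, and the `+ 1000` corruption that stands in for a Byzantine fault.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical federate replica whose state is a running sum of inputs."""
    name: str
    faulty: bool = False                      # True: the replica corrupts its state
    state: int = 0
    checkpoint: int = 0                       # last saved state
    log: list = field(default_factory=list)   # messages received since checkpoint

    def step(self, msg: int) -> int:
        """Log the message, update the state, and return the state as output."""
        self.log.append(msg)
        self.state += msg + (1000 if self.faulty else 0)
        return self.state

    def save_checkpoint(self) -> None:
        """Save the current state and truncate the message log."""
        self.checkpoint = self.state
        self.log.clear()

    def recover(self) -> None:
        """Roll back to the checkpoint and replay the logged messages."""
        self.faulty = False                   # assumption: the fault has been repaired
        self.state = self.checkpoint
        for msg in self.log:
            self.state += msg


def detect(outputs: dict) -> bool:
    """Phase 1 (detection): a failure exists if the replicas disagree."""
    return len(set(outputs.values())) > 1


def locate(outputs: dict) -> list:
    """Phase 2 (location): replicas that differ from the majority output."""
    majority, _ = Counter(outputs.values()).most_common(1)[0]
    return [name for name, out in outputs.items() if out != majority]


# Three replicas of the same federate process the same messages.
replicas = [Replica("r0"), Replica("r1"), Replica("r2")]
for r in replicas:
    r.step(5)
    r.save_checkpoint()

replicas[1].faulty = True                     # inject a fault into one replica
outputs = {r.name: r.step(7) for r in replicas}

suspects = locate(outputs) if detect(outputs) else []
for r in replicas:                            # Phase 3 (recovery)
    if r.name in suspects:
        r.recover()
```

Majority voting only identifies the faulty replica while correct replicas outnumber faulty ones. The sketch also omits the middleware layer that makes the mechanism transparent to the federates, and the handling of epidemic effects, both of which the summary mentions.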