Virtual machine fault tolerance

In a computer system running a primary virtual machine (VM) on virtualization software on a primary virtualized computer system (VCS) and running a secondary VM on virtualization software on a secondary VCS, a method for the secondary VM to provide quasi-lockstep fault tolerance for the primary VM i...

Full description

Saved in:
Bibliographic Details
Main Authors VENKITACHALAM GANESH, SHELDON JEFFREY W, JAIN ROHIT, SCALES DANIEL J, MALYUGIN VYACHESLAV, XU MIN, WEISSMAN BORIS
Format Patent
LanguageEnglish
Published 12.06.2012
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:In a computer system running a primary virtual machine (VM) on virtualization software on a primary virtualized computer system (VCS) and running a secondary VM on virtualization software on a secondary VCS, a method for the secondary VM to provide quasi-lockstep fault tolerance for the primary VM includes: as the primary VM is executing a workload, virtualization software in the primary VCS is: (a) causing predetermined events to be recorded in an event log, (b) keeping output associated with the predetermined events pending, and (c) sending the log entries to the virtualization software in the secondary VCS; as the secondary VM is replaying the workload, virtualization software in the secondary VCS is: (a) sending acknowledgements indicating that log entries have been received; (b) when the virtualization software encounters one of the predetermined events, searching the log entries to determine whether a log entry corresponding to the same event was received from the primary VCS, and if so, comparing data associated with the predetermined event produced by the secondary VM with that of the primary VM; if there is a match, the virtualization software in the secondary VCS transmitting an acknowledgement to the virtualization software in the primary VCS; one of the virtualization software in the primary or secondary VCS dropping the event and the other dispatching the output; and if there is no match, performing a checkpoint resynchronization.
Bibliography:Application Number: US20090484640