Method and apparatus for automatic recovery from a failed node concurrent maintenance operation

Bibliographic Details
Main Authors GOODMAN BENJIMAN L, FIELDS JAMES S.JR, LECOCQ PAUL F, REDDY PRAVEEN S, FLOYD MICHAEL S
Format Patent
Language English
Published 24.08.2006
Summary: A method, apparatus, and computer instructions are provided by the present invention to automatically recover from a failed node concurrent maintenance operation. Control logic sends a first test command to the processors of a new node. If the first test command succeeds, a second test command is sent to all processors, or to the remaining nodes if nodes were removed. If the second command succeeds, system operation resumes with the newly configured topology, with nodes either added or removed. If the response is incorrect or a timeout occurs, the control logic restores values to the current mode register and sends a third test command to check for an error. If an error is encountered, a fatal system attention is sent to a service processor or system software. If no error is found, system operation resumes with the previously configured topology.
Bibliography: Application Number: US20050054288
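
The recovery flow described in the summary can be sketched as a small decision procedure. This is an illustrative model only, not the patented implementation: the names `run_concurrent_maintenance`, `send_test`, `Outcome`, and the dictionary standing in for the mode register are all hypothetical. It also assumes that the new topology values are written before the tests run, and that a failure or timeout on either the first or the second test triggers the restore path; the summary does not state either point explicitly.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class Outcome(Enum):
    RESUMED_NEW_TOPOLOGY = "resumed with newly configured topology"
    RESUMED_OLD_TOPOLOGY = "resumed with previously configured topology"
    FATAL_ATTENTION = "fatal system attention raised"


def run_concurrent_maintenance(
    send_test: Callable[[str], Optional[bool]],
    mode_register: Dict[str, int],
    new_mode: Dict[str, int],
) -> Outcome:
    """Hypothetical model of the add/remove-node flow in the summary.

    send_test(stage) stands in for the hardware test command. It returns
    True for a correct response, False for an incorrect response, and
    None for a timeout.
    """
    # Assumption: the previous register values are saved, then the new
    # topology is applied before any test command is issued.
    saved = dict(mode_register)
    mode_register.update(new_mode)

    # First test targets the new node's processors; the second targets all
    # processors (or the remaining nodes after a removal). The second test
    # is only sent if the first one succeeds.
    if send_test("first") is True and send_test("second") is True:
        return Outcome.RESUMED_NEW_TOPOLOGY

    # Incorrect response or timeout: restore the mode register values.
    mode_register.clear()
    mode_register.update(saved)

    # Third test checks whether the restored configuration is error-free.
    if send_test("third") is True:
        return Outcome.RESUMED_OLD_TOPOLOGY

    # An error here is reported to the service processor or system software.
    return Outcome.FATAL_ATTENTION
```

Passing the test command in as a callable keeps the sketch independent of any real hardware interface, so each branch (success, rollback, fatal attention) can be exercised with a stub.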