Noninvasive Fault Classification, Robustness and Recovery Time Measurement in Microprocessor-Type Architectures Subjected to Radiation-Induced Errors

Bibliographic Details
Published in IEEE transactions on instrumentation and measurement Vol. 58; no. 5; pp. 1514-1524
Main Authors Guzman-Miranda, H., Aguirre, M.A., Tombs, J.
Format Journal Article
Language English
Published New York IEEE 01.05.2009
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: In critical digital designs such as aerospace or safety equipment, radiation-induced upset events (single-event effects, or SEEs) can produce adverse effects, so the ability to compare the sensitivity of proposed solutions is desirable. Because custom-hardened microprocessor solutions can be very costly, the reliability of commercial off-the-shelf (COTS) processors can be evaluated to determine whether a commercially available microprocessor or microprocessor-type intellectual property (IP) offers adequate robustness for the specific application. Most existing approaches to measuring microprocessor robustness divert the program flow and timing in order to introduce bit flips via interrupts and embedded handlers added to the application program. This paper describes a tool, based on an emulation platform using Xilinx field-programmable gate arrays (FPGAs), that provides an environment and methodology for evaluating the sensitivity of microprocessor architectures using dynamic runtime fault injection. A case study is presented in which the robustness of MicroBlaze and Leon3 microprocessors executing a simple signal processing task written in C is evaluated and compared. A hardened version of the program, in which the key variables are protected, has also been tested, and its contribution to system robustness has been evaluated. The paper also presents a further improvement to the tool that allows not only the measurement of microprocessor robustness but also the study and classification of single-event upset (SEU) effects and the exact measurement of the recovery time (the time the microprocessor takes to self-repair and recover the fault-free state). Measuring this recovery time is important for real-time critical applications, where criticality depends on both data correctness and timing.
To demonstrate the proposed improvements, a new software program implementing two different software hardening techniques (one for data and another for control flow) has been written, and the recovery times in some significant fault-injection cases have been studied on the Leon3 processor.
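Note: The ideas of bit-flip injection, fault classification, and recovery-time measurement named in the summary can be illustrated with a small software model. The sketch below is not the authors' FPGA tool or their classification scheme. It assumes a hypothetical 4-tap delay-line filter as the "processor" state, hypothetical class labels (`silent`, `failure_then_recovery`, `unrecovered`), and recovery time counted in loop cycles of the model rather than hardware clock cycles. A faulty run is compared against a fault-free (golden) run.

```python
def flip_bit(value, bit, width=16):
    """Emulate an SEU: invert one bit of a width-bit register."""
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def run(samples, fault=None):
    """Run the toy filter. fault = (cycle, tap_index, bit), or None for the golden run."""
    taps = [0, 0, 0, 0]              # hypothetical 4-register delay line
    outputs, states = [], []
    for cycle, x in enumerate(samples):
        if fault is not None and fault[0] == cycle:
            _, tap, bit = fault
            taps[tap] = flip_bit(taps[tap], bit)   # inject before the shift
        taps = [x] + taps[:-1]       # shift in the new sample, drop the oldest
        outputs.append(sum(taps))    # filter output for this cycle
        states.append(tuple(taps))   # snapshot of the state after this cycle
    return outputs, states


def classify(samples, fault):
    """Return (label, recovery_cycles) by comparing against the golden run."""
    gold_out, gold_state = run(samples)
    out, state = run(samples, fault)
    cycle = fault[0]
    recovered_at = None
    # Walk backwards to find the earliest cycle from which the state
    # matches the golden state through the end of the run.
    for t in range(len(samples) - 1, cycle - 1, -1):
        if state[t] != gold_state[t]:
            break
        recovered_at = t
    if recovered_at is None:
        return "unrecovered", None   # still corrupted when observation ends
    label = "failure_then_recovery" if out != gold_out else "silent"
    return label, recovered_at - cycle


samples = list(range(1, 11))
print(classify(samples, (4, 0, 3)))  # flip in the newest tap
print(classify(samples, (4, 3, 3)))  # flip in the tap about to be discarded
print(classify(samples, (9, 0, 0)))  # flip on the last observed cycle
```

In this model a flip in the newest tap corrupts the output until the value is shifted out, a flip in the oldest tap is discarded before it is ever used, and a flip on the final cycle cannot be observed recovering within the run.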
ISSN: 0018-9456
1557-9662
DOI: 10.1109/TIM.2009.2014603