A RISC-V Fault-Tolerant Soft-Processor Based on Full/Partial Heterogeneous Dual-Core Protection

The low probability of single event upsets (SEU) within particular satellite orbits, makes Commercial-off-the-shelf (COTS) electronic components a viable solution for space system implementation, thanks to the introduction of design-level fault tolerance techniques at the expense of some performance...

Full description

Saved in:
Bibliographic Details
Published inIEEE access Vol. 12; pp. 30495 - 30506
Main Authors Vigli, Francesco, Barbirotta, Marcello, Cheikh, Abdallah, Menichelli, Francesco, Mastrandrea, Antonio, Olivieri, Mauro
Format Journal Article
LanguageEnglish
Published Piscataway IEEE 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:The low probability of single event upsets (SEU) within particular satellite orbits, makes Commercial-off-the-shelf (COTS) electronic components a viable solution for space system implementation, thanks to the introduction of design-level fault tolerance techniques at the expense of some performance/energy/area penalty. This paper illustrates the design and validation of a novel RISC-V dual-core architecture, based on a computing paradigm that we refer to as full/partial heterogeneous multi-core protection. The approach relies on a small, low-performance, fully fault-tolerant core (LP core) coupled with a high-performance partially fault-tolerant core (HP core). The computing paradigm assumes the failure-exposed HP core executes computation intensive routines for relatively short periods of time, making the occurrence of failures a statistically unlikely situation, while the fully fault-tolerant LP core operates in critical control tasks and manages the failure recovery of the high-performance core. The execution time percentage in the LP core varies from a minimum of 11.4% up to a maximum of 91.3% while in the HP core it is between 8.7% and 88.6%, depending on the application. In the proposed study, both the cores belong to the RISC-V compliant Klessydra core family. The dual-core architecture also includes a watchdog timer controlled by the LP core and monitoring the non-protected HP core, and a context switch FIFO that speeds up the code and data switch between the two cores during failure recovery. A dedicated run-time software environment coordinates the execution of tasks on the high-performance core in a resilient fashion. The dual-core processor has been validated through extensive RTL simulations running in an UVM-based fault-injection environment, which emulates SEUs at various rates. Experimental results illustrate the benefits and limits obtained by using a heterogeneous architecture with different levels of protection and performance. The failure probability assuming a SEU fault occurrence can be reduced by a factor between 10X and 30X with respect to the non-protected architecture, leading to an average failure rate of up to 4.00E-06 per second with respect to 1.80E-05 per second in the non-protected architecture.
ISSN:2169-3536
2169-3536
DOI:10.1109/ACCESS.2024.3366806