PPAS-MiCs: Peak-power-aware scheduling of fault-tolerant mixed-criticality systems

Bibliographic Details
Published in	Sustainable Computing: Informatics and Systems Vol. 47; p. 101156
Main Authors	Shokri, Shayan; Safari, Sepideh; Hessabi, Shaahin; Ansari, Mohsen
Format	Journal Article
Language	English
Published	Elsevier Inc 01.09.2025
Subjects	Checkpointing; Mixed-criticality systems; Power and temperature management
Summary:	Multi-core platforms have become the dominant trend in designing Mixed-Criticality Systems (MCSs). The best-known MCS is the dual-criticality system, which consists of high-criticality (HC) and low-criticality (LC) tasks. As the number of cores grows, the fault rate in MCSs has also increased, which makes fault-tolerant techniques crucial. Although fault-tolerant techniques improve system reliability, they can raise the system temperature beyond safe limits. In this paper, we present peak-power-aware scheduling for MCSs that employs checkpointing while guaranteeing the timing, reliability, and thermal design power (TDP) constraints. In the proposed method, the minimum number of checkpoints for each task is first calculated and assigned to the different execution sections of the task. The cores are then divided into safety-critical and non-safety-critical pairs, and tasks are mapped to cores and scheduled. This division is preliminary and does not isolate the cores from each other. At each dedicated point in the schedule, if the TDP is violated, tasks are shifted from the last checkpoint until the constraint is satisfied. Finally, the existing slack times are exploited to improve the QoS and reduce the average power consumption of the system. Compared with state-of-the-art fault-tolerant techniques, the proposed method achieves 35.6% and 36.5% improvement in all scenarios and in feasible scenarios, respectively, without violating the TDP constraint.
• Employing the checkpointing technique not only to tolerate transient faults but also to manage the TDP constraint.
• Providing an offline temperature-aware schedule for all HC tasks and a maximum number of LC tasks while considering timing, temperature, and reliability constraints.
• Improving the QoS through the execution of LC tasks whenever static slack times are available.
• Exploiting online released slack times for reducing the average power consumption by dynamic power management.
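
The TDP check described in the summary (shift tasks from their last checkpoint until the power constraint holds) can be illustrated with a small sketch. This is not the authors' algorithm. It is a simplified model under stated assumptions: time is divided into discrete slots, each `Segment` stands for the part of a task that runs from its last checkpoint, total power is the sum of the power of the active segments, and the segment with the most slack is shifted first. The names `Segment`, `power_at`, and `shift_until_tdp_met` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Part of a task that executes from its last checkpoint (assumed model)."""
    name: str
    start: int      # slot in which the segment starts
    length: int     # number of slots it runs
    power: float    # power drawn while running (watts)
    deadline: int   # the segment must finish by this slot (exclusive end)


def power_at(segments, t):
    """Total power of all segments active in slot t."""
    return sum(s.power for s in segments if s.start <= t < s.start + s.length)


def shift_until_tdp_met(segments, tdp, horizon):
    """Delay segments in place until no slot in [0, horizon) exceeds tdp.

    Returns True on success, False if some violation cannot be removed
    without missing a deadline.
    """
    t = 0
    while t < horizon:
        if power_at(segments, t) <= tdp:
            t += 1
            continue
        # Segments active in slot t that can restart at t + 1 and still
        # meet their deadline.
        movable = [
            s for s in segments
            if s.start <= t < s.start + s.length
            and t + 1 + s.length <= s.deadline
        ]
        if not movable:
            return False
        # Assumed policy: shift the segment with the most remaining slack.
        victim = max(movable, key=lambda s: s.deadline - (t + 1 + s.length))
        victim.start = t + 1
        # Slot t is re-checked on the next iteration; earlier slots can
        # only have lost power, so they stay within the limit.
    return True
```

For example, with a TDP of 100 W, a 60 W segment that must finish by slot 2 and a 50 W segment with a deadline of slot 6 cannot overlap, so the second one is pushed back until the first has finished:

```python
segs = [Segment("hc", 0, 2, 60.0, 2), Segment("lc", 0, 2, 50.0, 6)]
ok = shift_until_tdp_met(segs, tdp=100.0, horizon=6)
```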
ISSN:	2210-5379
DOI:	10.1016/j.suscom.2025.101156