Phase-driven learning-based dynamic reliability management for multi-core processors

Bibliographic Details
Published in: 2017 54th ACM/EDAC/IEEE Design Automation Conference (DAC), pp. 1-6
Main Authors: Zhiyuan Yang, Caleb Serafy, Tiantao Lu, Ankur Srivastava
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2017
Summary: In this paper, we propose a phase-driven, Q-learning-based dynamic reliability management (DRM) technique for multi-core processors. It addresses DRM problems that maximize processor performance subject to a large class of reliability constraints, using two control knobs: turning cores ON/OFF and dynamic voltage and frequency scaling (DVFS). Our technique uses existing methods to detect program phases (e.g., [17]) and learns the optimal configuration of the multi-core processor for each phase, rather than obtaining it at the off-line stage. It outperforms existing learning-based DRM methods in managing programs with highly diverse phases. We evaluate the technique on a DRM problem for 3D CPUs: maximizing processor performance subject to an electromigration-induced power delivery network reliability constraint. Compared with the latest Q-learning-based DRM technique [11], our method achieves more than 1.3× improvement in performance with 77% memory savings.
DOI: 10.1145/3061639.3062301
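
The summary describes learning one processor configuration per detected program phase. The following is a minimal sketch of that general idea and not the authors' implementation. It uses tabular Q-learning with the phase as the state and a (cores, V/f level) pair as the action. The configuration space, the reward (performance when a reliability proxy is within its limit, a penalty otherwise), the toy performance and stress models, and all names (`PhaseQLearner`, `toy_reward`, `train`) are assumptions made for illustration; the record does not specify any of them.

```python
import random
from collections import defaultdict

# Hypothetical configuration space: (active cores, V/f level). Not from the paper.
CORES = (1, 2, 4)
VF_LEVELS = (0, 1, 2)
CONFIGS = [(c, v) for c in CORES for v in VF_LEVELS]


class PhaseQLearner:
    """Tabular Q-learning with one row of Q-values per program phase (assumed design)."""

    def __init__(self, alpha=0.5, gamma=0.5, epsilon=0.2, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # A row is created lazily the first time a phase is seen.
        self.q = defaultdict(lambda: {cfg: 0.0 for cfg in CONFIGS})

    def best(self, phase):
        row = self.q[phase]
        return max(CONFIGS, key=lambda cfg: row[cfg])

    def choose(self, phase):
        # Epsilon-greedy: explore occasionally, otherwise exploit the row.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(CONFIGS)
        return self.best(phase)

    def update(self, phase, config, reward, next_phase):
        # Standard Q-learning update with the phase as the state.
        target = reward + self.gamma * max(self.q[next_phase].values())
        row = self.q[phase]
        row[config] += self.alpha * (target - row[config])


def toy_reward(phase, config, stress_limit=6.0):
    """Toy stand-in for measured performance and a reliability check (assumption)."""
    cores, vf = config
    freq = 1.0 + 0.5 * vf
    stress = cores * freq ** 2          # crude proxy for reliability stress
    if stress > stress_limit:
        return -1.0                     # penalty for violating the constraint
    if phase == "compute":
        return cores * freq             # compute-bound: scales with cores and frequency
    return min(cores, 2) * (1.0 + 0.1 * vf)  # memory-bound: saturates early


def train(steps=20000, seed=0):
    learner = PhaseQLearner(seed=seed)
    trace = ("compute", "memory")       # phases strictly alternate in this toy trace
    for t in range(steps):
        phase, next_phase = trace[t % 2], trace[(t + 1) % 2]
        config = learner.choose(phase)
        learner.update(phase, config, toy_reward(phase, config), next_phase)
    return learner


if __name__ == "__main__":
    learner = train()
    for phase in ("compute", "memory"):
        print(phase, learner.best(phase))
```

Under these toy models, the learner should settle on different configurations for the two phases: all four cores at the lowest V/f level for the compute-bound phase, and two cores at the middle level for the memory-bound phase. Keeping a separate row per phase is what lets a program with diverse phases reuse what was learned when a phase recurs; in a real system the phase label would come from a phase detector and the reward from hardware measurements.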