Imposing coarse-grained reconfiguration to general purpose processors

Bibliographic Details
Published in 2015 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS), pp. 42-51
Main Authors Duric, M., Stanic, M., Ratkovic, I., Palomar, O., Unsal, O., Cristal, A., Valero, M., Smith, A.
Format Conference Proceeding
Language English
Published IEEE 01.07.2015
Summary: Mobile devices execute applications with diverse compute and performance demands. This paper proposes a general purpose processor that adapts the underlying hardware to a given workload. Existing mobile processors rely on increasingly complex heterogeneous substrates, incorporating different cores and specialized accelerators, to deliver the demanded performance. In contrast, our processor uses only modest homogeneous cores and dynamically provides an execution substrate suited to accelerating a particular workload. Instead of incorporating accelerators, the processor reconfigures one or more cores into accelerators on the fly, improving performance with minimal hardware additions. The accelerators consist of general purpose ALUs reconfigured into a compute fabric, together with the general purpose pipeline that streams data through the fabric. To enable reconfiguration of ALUs into the fabric, the floorplan of a 4-core processor is changed to place the ALUs in close proximity on the chip. A configurable switched network is added to couple and dynamically reconfigure the ALUs so that they compute frequently repeated regions directly, instead of executing general purpose instructions. Through this reconfiguration, the mobile processor specializes its substrate for a given workload and maximizes the performance of the existing resources. Our results show that reconfiguration accelerates a set of selected compute-intensive workloads by 1.56×, 2.39×, and 3.51× when configuring an accelerator from 1, 2, or 4 cores, respectively.
DOI: 10.1109/SAMOS.2015.7363658