Student research poster: Software out-of-order execution for in-order architectures

Bibliographic Details
Published in: 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT), p. 458
Main Author: Kim-Anh Tran
Format: Conference Proceeding
Language: English
Published: ACM, 01.09.2016

Summary: Processor cores are divided into two categories: fast and power-hungry out-of-order processors, and efficient but slower in-order processors. To achieve high performance within low energy budgets, this proposal aims to deliver out-of-order processing by software (SWOOP) on in-order architectures.

Problem: A primary cause of slowdown in in-order processors is last-level cache misses (caused by difficult-to-predict, data-dependent loads), which leave cores stalling.

Solution: As loads are non-blocking operations, independent instructions can be scheduled to execute before the loads return. We execute critical load instructions earlier in the program for a three-fold benefit: increasing memory-level parallelism, increasing instruction-level parallelism, and hiding memory latency.

Related work: Some instruction scheduling policies attempt to hide memory latency, but their scheduling is confined by basic-block boundaries and register pressure. Software pipelining [3] is restricted by dependencies between instructions, and decoupled access-execute (DAE) [1] suffers from address re-computation. Unlike EPIC [2] (which evolved from VLIW), SWOOP does not require hardware support for predicated execution, speculative loads and their verification, delayed exception handling, memory disambiguation, etc.
DOI: 10.1145/2967938.2971466
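
To make the Solution step concrete, below is a minimal C sketch (not taken from the poster) of the kind of load reordering SWOOP relies on; the function names and the group size of four are illustrative assumptions. Issuing several independent, non-blocking loads before their uses lets an in-order core keep multiple last-level-cache misses outstanding at once, raising memory-level parallelism and hiding latency behind independent work.

    #include <stddef.h>

    /* Baseline: each iteration's indirect load may miss in the last-level
     * cache; the dependent add stalls an in-order core until the data
     * returns, so at most one miss is outstanding at a time. */
    double sum_baseline(const double *values, const int *idx, size_t n) {
        double sum = 0.0;
        for (size_t i = 0; i < n; i++) {
            sum += values[idx[i]];           /* data-dependent load, then immediate use */
        }
        return sum;
    }

    /* Hand-written illustration of SWOOP-style reordering: issue a group of
     * independent loads first ("access" phase), then consume their results
     * ("execute" phase). With non-blocking loads, the four misses can
     * overlap while the core still has independent work to do. */
    double sum_swoop_style(const double *values, const int *idx, size_t n) {
        double sum = 0.0;
        size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            /* Access phase: four independent loads issued back to back. */
            double a = values[idx[i]];
            double b = values[idx[i + 1]];
            double c = values[idx[i + 2]];
            double d = values[idx[i + 3]];
            /* Execute phase: uses are scheduled after all loads were issued. */
            sum += a + b + c + d;
        }
        for (; i < n; i++)                   /* remainder iterations */
            sum += values[idx[i]];
        return sum;
    }

The poster proposes performing this kind of reordering in software in a more general setting than a hand-unrolled reduction; the sketch only conveys the basic access/execute split and why it increases the number of overlapping misses on an in-order core.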