Compiler Managed Dynamic Instruction Placement in a Low-Power Code Cache

Modern embedded microprocessors use low power on-chip memories called scratch-pad memories to store frequently executed instructions and data. Unlike traditional caches, scratch-pad memories lack the complex tag checking and comparison logic, thereby proving to be efficient in area and power. In thi...

Full description

Saved in:
Bibliographic Details
Published inCode Generation and Optimization: Proceedings of the international symposium on Code generation and optimization; 20-23 Mar. 2005 pp. 179 - 190
Main Authors Ravindran, Rajiv A., Nagarkar, Pracheeti D., Dasika, Ganesh S., Marsman, Eric D., Senger, Robert M., Mahlke, Scott A., Brown, Richard B.
Format Conference Proceeding
LanguageEnglish
Published Washington, DC, USA IEEE Computer Society 20.03.2005
IEEE
SeriesACM Conferences
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Modern embedded microprocessors use low power on-chip memories called scratch-pad memories to store frequently executed instructions and data. Unlike traditional caches, scratch-pad memories lack the complex tag checking and comparison logic, thereby proving to be efficient in area and power. In this work, we focus on exploiting scratch-pad memories for storing hot code segments within an application. Static placement techniques focus on placing the most frequently executed portions of programs into the scratch-pad. However, static schemes are inherently limited by not allowing the contents of the scratch-pad memory to change at run time. In a large fraction of applications, the instruction memory footprints exceed the scratch-pad memory size, thereby limiting the usefulness of the scratch-pad. We propose a compiler managed dynamic placement algorithm, wherein multiple hot code sequences, or traces, are overlapped with each other in the scratch-pad memory at different points in time during execution. Special copy instructions are provided to copy the traces into the scratch-pad memory at run-time. Using a power estimate, the compiler initially selects the most frequent traces in an application for relocation into the scratch-pad memory. Through iterative code motion and redundancy elimination, copy instructions are inserted in infrequently executed regions of the code. For a 64-byte code cache, the compiler managed dynamic placement achieves an average of 64% energy improvement over the static solution in a low-power embedded microcontroller.
Bibliography:SourceType-Conference Papers & Proceedings-1
ObjectType-Conference Paper-1
content type line 25
ISBN:9780769522982
076952298X
DOI:10.1109/CGO.2005.13