Optimal sample length for efficient cache simulation

Bibliographic Details
Published in	Journal of Systems Architecture, Vol. 51, no. 9, pp. 513-525
Main Authors	Eeckhout, Lieven; Niar, Smaïl; De Bosschere, Koen
Format	Journal Article
Language	English
Published	Amsterdam: Elsevier B.V.; Elsevier Sequoia S.A., 01.09.2005
Subjects	Benchmarks; Cache; Cold-start problem; Computer architecture; Performance analysis; Simulation; Studies; Trace sampling

More Information
Summary:	Architectural simulation of microprocessors is extremely time-consuming because of the ever-increasing complexity of current applications. To obtain realistic workloads on current hardware, benchmarks need to be constructed with huge dynamic instruction counts. For example, SPEC released the CPU2000 benchmark suite, whose benchmarks have dynamic instruction counts of several hundred billion instructions. This is beneficial for evaluating real hardware. However, simulating these workloads is impractical, if not impossible, given that many simulation runs are needed to evaluate a large number of design points. Trace sampling is often used as a practical solution to this problem. In trace sampling, several representative samples are chosen from a real program trace. Since the sampled trace is much shorter than the original trace, a significant simulation speedup is obtained. In this paper, we study the optimal sample length for achieving a given level of accuracy while maximizing the total simulation speedup. From various experiments using SPEC CPU2000, we conclude that the optimal sample length (i) is not fixed across benchmarks and (ii) increases with increasing warmup length. We therefore propose an algorithm that determines the optimal sample length per benchmark under different warmup scenarios. This is done within the context of sampled cache simulation.
ISSN:	1383-7621, 1873-6165
DOI:	10.1016/j.sysarc.2004.12.004
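
Illustration:	The trade-off described in the summary (accuracy versus speedup as sample and warmup lengths vary) can be sketched with a toy sampled cache simulation. This is a minimal sketch under stated assumptions, not the algorithm proposed in the paper: the direct-mapped cache model, the synthetic trace, the periodic sample placement, and the names `miss_flags` and `sampled_miss_rate` are all hypothetical choices made for illustration.

```python
def miss_flags(trace, num_sets=64, block=16):
    """Simulate a direct-mapped cache that starts empty.

    Returns one flag per access: 1 for a miss, 0 for a hit.
    (Cache geometry is an assumption chosen for illustration.)
    """
    tags = [None] * num_sets
    flags = []
    for addr in trace:
        blk = addr // block
        idx, tag = blk % num_sets, blk // num_sets
        flags.append(0 if tags[idx] == tag else 1)
        tags[idx] = tag
    return flags


def sampled_miss_rate(trace, sample_len, warmup_len, period):
    """Estimate the miss rate from periodic samples.

    Every `period` accesses, a unit of `warmup_len + sample_len`
    accesses is simulated from a cold cache. Warmup accesses update
    the cache but are excluded from the statistics.
    Returns (estimated miss rate, simulation speedup).
    """
    assert warmup_len + sample_len <= period
    misses = measured = simulated = 0
    for start in range(0, len(trace) - period + 1, period):
        unit = trace[start:start + warmup_len + sample_len]
        flags = miss_flags(unit)
        misses += sum(flags[warmup_len:])   # count only the sample part
        measured += len(unit) - warmup_len
        simulated += len(unit)
    return misses / measured, len(trace) / simulated


# Synthetic trace: a loop over 32 cache blocks, which fits in the cache.
trace = [(i % 32) * 16 for i in range(10_000)]

true_rate = sum(miss_flags(trace)) / len(trace)        # 0.0032
cold = sampled_miss_rate(trace, 64, 0, 1000)           # (0.5, 15.625)
warm = sampled_miss_rate(trace, 64, 32, 1000)          # miss rate 0.0
print(true_rate, cold, warm)
```

In this toy setting, samples taken without warmup grossly overestimate the miss rate because every sample starts with an empty cache (the cold-start problem), while adding warmup removes that bias at the cost of a lower speedup, since the warmup accesses must also be simulated.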