Speculative synchronization for coherence-free embedded NUMA architectures

High-end embedded systems, like their general-purpose counterparts, are turning to many-core cluster-based shared-memory architectures that provide a shared memory abstraction subject to non-uniform memory access (NUMA) costs. In order to keep the cores and memory hierarchy simple, many-core embedde...

Full description

Saved in:
Bibliographic Details
Published in2014 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XIV) pp. 99 - 106
Main Authors Papagiannopoulou, Dimitra, Moreshet, Tali, Marongiu, Andrea, Benini, Luca, Herlihy, Maurice, Iris Bahar, R.
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.07.2014
Subjects
Online AccessGet full text
DOI10.1109/SAMOS.2014.6893200

Cover

Loading…
More Information
Summary:High-end embedded systems, like their general-purpose counterparts, are turning to many-core cluster-based shared-memory architectures that provide a shared memory abstraction subject to non-uniform memory access (NUMA) costs. In order to keep the cores and memory hierarchy simple, many-core embedded systems tend to employ simple, scratchpad-like memories, rather than hardware managed caches that require some form of cache coherence management. These "coherence-free" systems still require some means to synchronize memory accesses and guarantee memory consistency. Conventional lock-based approaches may be employed to accomplish the synchronization, but may lead to both useability and performance issues. Instead, speculative synchronization, such as hardware transactional memory, may be a more attractive approach. However, hardware speculative techniques traditionally rely on the underlying cache-coherence protocol to synchronize memory accesses among the cores. The lack of a cache-coherence protocol adds new challenges in the design of hardware speculative support. In this paper, we present a new scheme for hardware transactional memory support within a cluster-based NUMA system that lacks an underlying cache-coherence protocol. To the best of our knowledge, this is the first design for speculative synchronization for this type of architecture. Through a set of benchmark experiments using our simulation platform, we show that our design can achieve significant performance improvements over traditional lock-based schemes.
DOI:10.1109/SAMOS.2014.6893200