Is the Future Cold or Tall? Design Space Exploration of Cryogenic and 3D Embedded Cache Memory
Published in | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) pp. 134 - 144 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.04.2023 |
DOI | 10.1109/ISPASS57527.2023.00022 |
Summary: | Memory latency, density, and power efficiency are key bottlenecks in a variety of computing systems, and the need for efficient and dense memory solutions is exacerbated by the continued importance of data-intensive applications such as machine learning, graph processing, and scientific computing. A myriad of emerging technologies and approaches aim to address the limitations of current systems. For example, 3D integration can enable highly dense memory structures, and multiple alternative device technologies such as STT and PCM have emerged as compelling solutions to improve memory system density and efficiency. Additionally, cryogenic operation of computing systems (i.e., ultra-low temperature cooling) is becoming a compelling solution as thermal hotspots have become a primary roadblock to conventional transistor scaling. This work probes, evaluates, and compares the potential capabilities of 3D integration, embedded non-volatile memories (eNVMs), and cryogenic operation towards improving future memory systems by presenting the first design space exploration of cryogenic operation and 3D integration applied towards the largest on-chip memory structure, the last level cache, as well as presenting and providing open-source tools for future, related design studies. This work specifically evaluates the application-level benefits or limitations of such proposals by leveraging a cross-computing-stack simulation approach. Our studies reveal that the most compelling solution varies depending on the expected memory traffic patterns and workloads of interest, which in turn exposes several opportunities for future optimization and customization. For example, due to potentially high costs of cooling to cryogenic operation, we find that SRAM or 3T-eDRAM operating at 77K is sub-optimal compared to room-temperature SRAM and eNVM solutions, but exhibits advantages for relatively low-traffic workloads. |
---|---|