A Stacked FPGA utilizing 3D-SRAM with Latency Optimization

Bibliographic Details
Published in: Proceedings (IEEE International Symposium on Embedded Multicore/Manycore SoCs. Online), pp. 400 - 406
Main Authors: Takahashi, Ryo; Ando, Kota; Nakahara, Hiroki
Format: Conference Proceeding
Language: English
Published: IEEE, 16.12.2024
ISSN: 2771-3075
DOI: 10.1109/MCSoC64144.2024.00072

Summary: AI technology is rapidly evolving, often leading to significant changes in model architectures due to ongoing research and development. When designing AI hardware accelerators, flexibility is crucial to accommodate potential future changes in AI model structures. FPGAs, increasingly utilized as AI accelerators, implement logic using lookup tables (LUTs) and are characterized by their high flexibility. Executing computations via LUTs rather than dedicated circuits is what provides this flexibility. Generally, increasing the number of inputs to a LUT reduces FPGA latency but increases area. The use of 3D-SRAM, which is approaching practical application, may allow the number of LUT inputs to be increased without a significant area penalty, potentially enhancing FPGA performance. However, 3D-SRAM has higher access latency than the SRAM traditionally used in LUTs, so the total latency reduction gained from increasing the number of LUT inputs may be negated. In this study, we developed a simulator for an FPGA equipped with large-input LUTs built on 3D-SRAM and conducted performance comparison experiments against conventional FPGAs.
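
As a rough illustration of the trade-off described in the summary, the following Python sketch models a k-input LUT as 2^k configuration bits and estimates critical-path delay as the number of LUT levels times a per-level (LUT read + routing) delay. The depth model and all latency values are hypothetical placeholders, not figures from the paper's simulator; the sketch only shows how a higher 3D-SRAM read latency can offset the reduction in logic depth that larger LUTs provide.

import math

def lut_config_bits(k: int) -> int:
    # A k-input LUT stores one truth-table bit per input combination: 2^k bits.
    return 2 ** k

def estimated_lut_levels(n_inputs: int, k: int) -> int:
    # Simplified depth model: a balanced tree of k-input LUTs covering an
    # n-input function needs roughly log_k(n) levels (illustrative only).
    return max(1, math.ceil(math.log(n_inputs, k)))

def critical_path_ns(n_inputs: int, k: int, lut_read_ns: float, route_ns: float) -> float:
    # Total delay = number of LUT levels * (per-LUT read latency + routing delay).
    return estimated_lut_levels(n_inputs, k) * (lut_read_ns + route_ns)

# Hypothetical comparison: a conventional 6-input LUT (planar SRAM) versus a
# larger 10-input LUT whose truth table is held in slower 3D-stacked SRAM.
planar = critical_path_ns(n_inputs=64, k=6, lut_read_ns=0.3, route_ns=0.8)
stacked = critical_path_ns(n_inputs=64, k=10, lut_read_ns=0.6, route_ns=0.8)
print(f"6-LUT  (planar SRAM): {planar:.2f} ns, {lut_config_bits(6)} config bits/LUT")
print(f"10-LUT (3D-SRAM):     {stacked:.2f} ns, {lut_config_bits(10)} config bits/LUT")

With these placeholder numbers the larger LUT still wins on path delay despite its slower read, but reversing the outcome only takes a modestly higher 3D-SRAM latency, which is exactly the sensitivity the paper's simulator is built to evaluate.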