Resource-efficient acceleration of 2-dimensional Fast Fourier Transform computations on FPGAs
Published in | 2009 Third ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC) pp. 1 - 8 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.08.2009 |
Summary: | The 2-dimensional (2D) fast Fourier transform (FFT) is a fundamental, computationally intensive function that is of broad relevance to distributed smart camera systems. In this paper, we develop a systematic method for improving the throughput of 2D-FFT implementations on field-programmable gate arrays (FPGAs). Our method is based on a novel loop unrolling technique for FFT implementation, which extends our recent work on FPGA architectures for 1D-FFT implementation. This unrolling technique deploys multiple processing units within a single 1D-FFT core to achieve efficient configurations of data parallelism while minimizing memory space requirements and FPGA slice consumption. Furthermore, using our techniques for parallel processing within individual 1D-FFT cores, the number of input/output (I/O) ports within a given 1D-FFT core is limited to one input port and one output port. In contrast, previous 2D-FFT design approaches require multiple I/O pairs with multiple FFT cores. This streamlining of 1D-FFT interfaces makes it possible to avoid complex interconnection networks and associated scheduling logic for connecting multiple I/O ports from 1D-FFT cores to the I/O channel of external memory devices. Hence, our proposed unrolling technique maximizes the ratio of the achieved throughput to the consumed FPGA resources under pre-defined constraints on I/O channel bandwidth. To provide generality, our framework for 2D-FFT implementation can be efficiently parameterized in terms of key design parameters such as the transform size and I/O data word length. |
---|---|
DOI: | 10.1109/ICDSC.2009.5289356 |
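
The summary describes building a 2D-FFT out of 1D-FFT cores. As a rough software illustration only, the standard row-column decomposition can be sketched as below: a 1D FFT is applied to every row, then to every column of the intermediate result. This is not the paper's FPGA architecture and does not model its loop unrolling, processing units, or I/O ports. The function names `fft1d`, `fft2d`, and `dft2d` are hypothetical, and the radix-2 recursion assumes power-of-two transform sizes.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 decimation-in-time FFT (assumes a power-of-two length)."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d(image):
    """Row-column 2D FFT: 1D FFT over each row, then over each column."""
    rows = [fft1d(list(row)) for row in image]
    cols = [fft1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(r) for r in zip(*cols)]


def dft2d(image):
    """Direct 2D DFT from the definition, used only as a slow reference."""
    m_size, n_size = len(image), len(image[0])
    result = []
    for u in range(m_size):
        out_row = []
        for v in range(n_size):
            acc = 0j
            for m in range(m_size):
                for n in range(n_size):
                    angle = -2 * cmath.pi * (u * m / m_size + v * n / n_size)
                    acc += image[m][n] * cmath.exp(1j * angle)
            out_row.append(acc)
        result.append(out_row)
    return result


if __name__ == "__main__":
    image = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
    fast = fft2d(image)
    slow = dft2d(image)
    max_err = max(abs(fast[u][v] - slow[u][v]) for u in range(4) for v in range(4))
    print("max deviation from direct DFT:", max_err)
```

In this sketch every 1D transform consumes one input sequence and produces one output sequence, which loosely mirrors the single-input, single-output core interface the summary mentions. Any parallelism inside a hardware core would be hidden behind that interface.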