Portable and efficient FFT and DCT algorithms with the Heterogeneous Butterfly Processing Library
Published in | Journal of parallel and distributed computing Vol. 125; pp. 135 - 146 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Elsevier Inc, 01.03.2019 |
Summary: | The wide variety of computing devices with very different properties makes it essential to develop software that is not only portable among them but also adapts to the properties of each platform. In this paper, we present the Heterogeneous Butterfly Processing Library (HBPL), which provides optimized portable kernels for small problem sizes that allow orthogonal transform algorithms such as the FFT and DCT to be used on different accelerators and regular CPUs. Our library is implemented on the OpenCL standard, which provides portability across a large number of platforms. Furthermore, high performance is achieved on a wide range of devices by exploiting run-time code generation and metaprogramming guided by a parametrization strategy. An exhaustive evaluation on different platforms shows that our proposal obtains performance competitive with or better than that of related libraries.
• We present HBPL, a portable OpenCL-based library for FFT and DCT using kernels.
• HBPL provides an analytical model for the search for the best implementations.
• The model can pinpoint the best implementation or restrict the search to minutes.
• HBPL shows better performance and portability than clFFT, the most closely related library, on GPUs.
• We provide the first portable implementation of the DCT that we know of. |
---|---|
ISSN: | 0743-7315 1096-0848 |
DOI: | 10.1016/j.jpdc.2018.11.011 |