Efficient GPU-accelerated parallel cross-correlation

Bibliographic Details
Published in Journal of parallel and distributed computing Vol. 199; p. 105054
Main Authors Maděra, Karel; Šmelko, Adam; Kruliš, Martin
Format Journal Article
Language English
Published Elsevier Inc 01.05.2025
Summary: Cross-correlation is a data analysis method widely employed in signal processing and similarity-search applications. Our objective is to design a highly optimized GPU-accelerated implementation that speeds up these applications and also improves energy efficiency, since GPUs are more efficient than CPUs in data-parallel tasks. There are two rudimentary ways to compute cross-correlation: a definition-based algorithm that tries all possible overlaps, and an algorithm based on the Fourier transform, which is much more complex but has better asymptotic time complexity. We have focused mainly on the definition-based approach, which is better suited for smaller input data, and we have implemented multiple CUDA-enabled algorithms with multiple optimization options. The algorithms were evaluated in various scenarios, including the most typical types of multi-signal correlations, and we provide empirically verified optimal solutions for each of the studied scenarios.

• Novel GPU-accelerated cross-correlation that uses warp-shuffles and speeds up the computation by an order of magnitude.
• Several caching optimizations improve the algorithm further in specific cases.
• Detailed analysis of data access patterns and thorough discussion of possible register caching opportunities.
• Empirical evaluation of individual optimizations, complete solutions (various scenarios), and comparison with baselines.
• Guidelines for selecting the best combination of algorithms and optimizations based on problem configuration and input size.
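
For readers unfamiliar with the definition-based approach mentioned in the summary, the following is a minimal CPU sketch of it for two one-dimensional sequences. It is not the authors' CUDA implementation; the function name `cross_correlate`, the one-dimensional inputs, and the output ordering (most negative shift first) are assumptions made only for illustration.

```python
def cross_correlate(a, b):
    """Definition-based cross-correlation of two 1D sequences (illustrative sketch).

    Tries every possible overlap of b against a. The result has
    len(a) + len(b) - 1 entries; entry k corresponds to the shift
    s = k - (len(b) - 1) and equals the sum over i of a[i + s] * b[i],
    with out-of-range elements treated as zero.
    """
    n, m = len(a), len(b)
    result = []
    for shift in range(-(m - 1), n):      # every shift with at least one overlapping element
        total = 0
        for i in range(m):
            j = i + shift                 # index into a that lines up with b[i]
            if 0 <= j < n:                # skip positions that fall outside a
                total += a[j] * b[i]
        result.append(total)
    return result


if __name__ == "__main__":
    # Autocorrelation of a short signal; the largest value sits at zero shift.
    print(cross_correlate([1, 2, 3], [1, 2, 3]))
```

The nested loops cost on the order of len(a) × len(b) multiplications, and each shift is computed independently of the others, which is the property that makes this formulation a natural fit for data-parallel hardware.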
ISSN:0743-7315
DOI:10.1016/j.jpdc.2025.105054