GPU accelerated dynamic functional connectivity analysis for functional MRI data


Bibliographic Details
Published in Computerized Medical Imaging and Graphics, Vol. 43; pp. 53-63
Main Authors Akgün, Devrim; Sakoğlu, Ünal; Esquivel, Johnny; Adinoff, Bryon; Mete, Mutlu
Format Journal Article
Language English
Published United States Elsevier Ltd 01.07.2015
Summary:
• First dynamic functional connectivity (DFC) analysis study using GPU and OpenMP.
• We proposed two parallel algorithms for DFC analysis.
• CUDA- and OpenMP-based algorithms are implemented and tested on fMRI datasets.
• In CUDA, thread- and block-based approaches were analyzed, discussed, and compared.
• A CUDA-based design reached up to 157× speed-up.

Recent advances in multi-core processors and graphics-card-based computational technologies have paved the way for improved and dynamic utilization of parallel computing techniques. Numerous applications have been implemented to accelerate computationally intensive problems in various computational science fields, including bioinformatics, in which big data problems are prevalent. In neuroimaging, dynamic functional connectivity (DFC) analysis is a computationally demanding method used to investigate dynamic functional interactions among different brain regions or networks identified with functional magnetic resonance imaging (fMRI) data. In this study, we implemented and analyzed a parallel DFC algorithm based on thread-based and block-based approaches. The thread-based approach was designed to parallelize DFC computations and was implemented on both the Open Multi-Processing (OpenMP) and Compute Unified Device Architecture (CUDA) programming platforms. The block-based approach was developed to better utilize the CUDA architecture; here, parallelization involves smaller parts of the fMRI time-courses obtained by sliding windows. Experimental results showed that the proposed parallel design solutions enabled by GPUs significantly reduce the computation time for DFC analysis. The multi-core implementation using OpenMP on an 8-core processor provides up to 7.7× speed-up. The GPU implementation using CUDA yielded speed-ups ranging from 18.5× to 157× once the thread-based and block-based approaches were combined in the analysis.
The proposed parallel programming solutions showed that multi-core processor and CUDA-supported GPU implementations accelerate DFC analyses significantly. The developed algorithms make DFC analyses more practical for multi-subject studies with more dynamic analyses.
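To make the sliding-window idea in the summary concrete, the following is a minimal serial sketch, not the authors' implementation. It assumes that connectivity within each window is measured by Pearson correlation, which the summary does not state, and the names `pearson`, `sliding_window_dfc`, `width`, and `step` are hypothetical. Each (pair, window) computation is independent of the others, which is the kind of work unit a thread-based parallel design could distribute across OpenMP or CUDA threads.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed measure)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # Return 0.0 for a constant window to avoid dividing by zero.
    return cov / denom if denom > 0 else 0.0


def sliding_window_dfc(time_courses, width, step=1):
    """Windowed correlations for every pair of time-courses.

    time_courses: list of equal-length lists (one per region or network).
    width, step:  hypothetical window length and stride, in time points.
    Returns a dict mapping (i, j) to the list of per-window correlations.
    """
    n_points = len(time_courses[0])
    starts = range(0, n_points - width + 1, step)
    dfc = {}
    for i, j in combinations(range(len(time_courses)), 2):
        # Every window below is an independent task; a parallel version
        # could assign one task (or one pair) to each thread.
        dfc[(i, j)] = [
            pearson(time_courses[i][s:s + width], time_courses[j][s:s + width])
            for s in starts
        ]
    return dfc
```

For N time-courses and W windows this performs N(N-1)/2 × W correlations, which is why the workload grows quickly in multi-subject studies and why parallel execution helps.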
ISSN:0895-6111
1879-0771
DOI:10.1016/j.compmedimag.2015.02.009