Fast GPU 3D diffeomorphic image registration


Bibliographic Details
Published in Journal of parallel and distributed computing Vol. 149; no. C; pp. 149-162
Main Authors Brunn, Malte; Himthani, Naveen; Biros, George; Mehl, Miriam; Mang, Andreas
Format Journal Article
Language English
Published United States, Elsevier Inc, 01.03.2021
Summary: 3D image registration is one of the most fundamental and computationally expensive operations in medical image analysis. Here, we present a mixed-precision, Gauss–Newton–Krylov solver for diffeomorphic registration of two images. Our work extends the publicly available CLAIRE library to GPU architectures. Despite the importance of image registration, only a few implementations of large deformation diffeomorphic registration packages support GPUs. Our contributions are new algorithms to significantly reduce the run time of the two main computational kernels in CLAIRE: calculation of derivatives and scattered-data interpolation. We (i) deploy highly optimized, mixed-precision GPU kernels for the evaluation of scattered-data interpolation, (ii) replace Fast-Fourier-Transform (FFT)-based first-order derivatives with optimized 8th-order finite differences, and (iii) compare with state-of-the-art CPU and GPU implementations. As a highlight, we demonstrate that we can register 256³ clinical images in less than 6 s on a single NVIDIA Tesla V100. This amounts to over 20× speed-up over the current version of CLAIRE and over 30× speed-up over existing GPU implementations.
Highlights:
• The LDDMM software CLAIRE is ported to GPU.
• Compute-intensive kernels are optimized.
• A mixed-precision approach with Fast-Fourier-Transforms and finite differences is used.
• Hardware acceleration is used for linear and cubic interpolations.
• Clinical images can be registered in less than 6 seconds.
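To illustrate what an 8th-order finite-difference first derivative looks like, here is a minimal sketch in plain Python. It is not the paper's GPU kernel. It assumes a 1D, uniformly spaced, periodic grid and uses the standard central-difference weights (4/5, -1/5, 4/105, -1/280); the names `fd8_periodic` and `max_error` are hypothetical and chosen only for this example.

```python
import math

# Standard 8th-order central-difference weights for the first derivative
# at offsets 1..4. The stencil is antisymmetric, so only the positive
# side is stored. (Assumed textbook weights, not taken from CLAIRE.)
COEFFS = (4.0 / 5.0, -1.0 / 5.0, 4.0 / 105.0, -1.0 / 280.0)


def fd8_periodic(f, h):
    """Approximate df/dx for samples f on a periodic grid with spacing h."""
    n = len(f)
    df = [0.0] * n
    for i in range(n):
        acc = 0.0
        for k, c in enumerate(COEFFS, start=1):
            # Wrap indices to impose periodic boundary conditions.
            acc += c * (f[(i + k) % n] - f[(i - k) % n])
        df[i] = acc / h
    return df


def max_error(n):
    """Maximum error when differentiating sin(x) on [0, 2*pi) with n points."""
    h = 2.0 * math.pi / n
    xs = [i * h for i in range(n)]
    df = fd8_periodic([math.sin(x) for x in xs], h)
    return max(abs(d - math.cos(x)) for d, x in zip(df, xs))


if __name__ == "__main__":
    # Halving h should shrink the error by roughly 2**8 for a smooth function.
    for n in (16, 32, 64):
        print(n, max_error(n))
```

Each output point reads only eight neighbouring samples, whereas an FFT-based derivative involves every sample along the axis. That locality is the usual motivation for such stencils on GPUs. The sketch makes no claim about how the actual CLAIRE kernels are organized.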
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
NA0003969; SC0019393; DMS-1854853; DMS-2009923; DMS-2012825; CCF-1817048; CCF-172574; FA9550-17-1-0190; 5R01NS042645-11A1
National Science Foundation (NSF)
National Institutes of Health (NIH)
USDOE National Nuclear Security Administration (NNSA)
US Air Force Office of Scientific Research (AFOSR)
Author Statement
Malte Brunn: methodology, software, investigation, writing (original draft and review/editing), visualization.
Naveen Himthani: methodology, software, investigation, writing (original draft and review/editing), visualization.
Andreas Mang: methodology, software, investigation, writing (original draft and review/editing), visualization, supervision, funding acquisition.
George Biros: methodology, writing (original draft and review/editing), supervision, project administration, funding acquisition.
Miriam Mehl: methodology, writing (original draft and review/editing), supervision, project administration, funding acquisition.
ISSN: 0743-7315, 1096-0848
DOI: 10.1016/j.jpdc.2020.11.006