Spatiotemporal Modeling Encounters 3D Medical Image Analysis: Slice-Shift UNet with Multi-View Fusion
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 24.07.2023 |
Summary: | As a fundamental part of computational healthcare, Computed Tomography (CT)
and Magnetic Resonance Imaging (MRI) provide volumetric data, making the
development of algorithms for 3D image analysis a necessity. Although they are
computationally cheap, 2D Convolutional Neural Networks (CNNs) can only extract
two-dimensional spatial information. In contrast, 3D CNNs can extract
three-dimensional features, but they have higher computational costs and
latency, a limitation in clinical practice, which requires fast and efficient
models. Inspired by the field of video action recognition, we propose a new
2D-based model dubbed Slice SHift UNet (SSH-UNet), which encodes
three-dimensional features at the complexity of a 2D CNN. More precisely,
multi-view features are collaboratively learned by performing 2D convolutions
along the three orthogonal planes of a volume and imposing a weight-sharing
mechanism. The third dimension, which is neglected by the 2D convolution, is
reincorporated by shifting a portion of the feature maps along the slice axis.
The effectiveness of our approach is validated on the Multi-Modality Abdominal
Multi-Organ Segmentation (AMOS) and Multi-Atlas Labeling Beyond the Cranial
Vault (BTCV) datasets, showing that SSH-UNet is more efficient than
state-of-the-art architectures while performing on par with them. |
---|---|
DOI: | 10.48550/arxiv.2307.12853 |
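
The slice-shift step mentioned in the summary can be illustrated with a minimal sketch. The summary only states that a portion of the feature maps is shifted along the slice axis; everything else here is an assumption made for illustration: the function name `slice_shift`, the `shift_fraction` parameter, the choice of shifting one group of channels forward and another backward (in the style of temporal-shift ideas from video models), and zero-filling of vacated slices. It uses plain Python lists rather than a tensor library and is not the authors' implementation.

```python
from typing import List

Slice = List[List[float]]    # one 2D feature slice (H x W)
Channel = List[Slice]        # slices stacked along the slice axis (D)
FeatureMap = List[Channel]   # channels (C) of a single volume


def zero_slice_like(s: Slice) -> Slice:
    """Return an all-zero slice with the same height and width as `s`."""
    return [[0.0 for _ in row] for row in s]


def slice_shift(features: FeatureMap, shift_fraction: float = 0.25) -> FeatureMap:
    """Shift a portion of the channels by one position along the slice axis.

    Assumed scheme (not confirmed by the summary): the first `fold` channels
    move one slice forward, the next `fold` channels move one slice backward,
    and the remaining channels stay untouched. Vacated slices are zero-filled.
    """
    fold = int(len(features) * shift_fraction)
    shifted: FeatureMap = []
    for c, channel in enumerate(features):
        pad = zero_slice_like(channel[0])
        if c < fold:
            # Forward shift: slice d receives the content of slice d - 1.
            shifted.append([pad] + channel[:-1])
        elif c < 2 * fold:
            # Backward shift: slice d receives the content of slice d + 1.
            shifted.append(channel[1:] + [pad])
        else:
            # Unshifted channels keep their original slice order.
            shifted.append(list(channel))
    return shifted


# Toy volume: 4 channels, 3 slices, each slice is 1 x 1 and stores (slice index + 1).
toy = [[[[float(d + 1)]] for d in range(3)] for _ in range(4)]
out = slice_shift(toy, shift_fraction=0.25)
flat = [[s[0][0] for s in ch] for ch in out]
print(flat)
# [[0.0, 1.0, 2.0], [2.0, 3.0, 0.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
```

After such a shift, a 2D convolution applied to slice d would see features that originated in slices d - 1 and d + 1 through the shifted channels, which is one way to read the claim that the third dimension is reincorporated without the cost of a 3D convolution.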