SwinE-UNet3+: swin transformer encoder network for medical image segmentation

Bibliographic Details
Published in Progress in artificial intelligence Vol. 12; no. 1; pp. 99-105
Main Authors Zou, Ping, Wu, Jian-Sheng
Format Journal Article
Language English
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.03.2023
Springer Nature B.V.
Summary: A SwinE-UNet3+ model is proposed to address two weaknesses of convolutional neural networks in tumor segmentation tasks: their limited receptive field prevents them from capturing long-range feature dependencies, and they are insensitive to contour details. Each encoder layer of SwinE-UNet3+ uses two consecutive Swin Transformer blocks to extract features, especially long-range features in images. Patch Merging is used for down-sampling between encoder layers. The decoder uses Conv2DTranspose for progressive up-sampling and uses convolution to aggregate the up-sampled decoder information with the encoder information passed through skip connections. The proposed model is evaluated on the TipDM Cup rectal cancer dataset and the ISIC-2017 melanoma dermoscopic image dataset. Experimental results show that the SwinE-UNet3+ model outperforms the UNet, UNet++ and UNet3+ models on the Dice coefficient, IoU and Precision evaluation metrics.
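The three evaluation metrics named in the summary can be illustrated with a minimal sketch. It assumes binary masks and the standard pixel-count definitions (Dice = 2TP / (2TP + FP + FN), IoU = TP / (TP + FP + FN), Precision = TP / (TP + FP)). The function name `segmentation_metrics` and the smoothing term `eps` are hypothetical choices for illustration, not taken from the paper, whose exact evaluation code is not described in this record.

```python
import numpy as np


def segmentation_metrics(pred, target, eps=1e-7):
    """Return (dice, iou, precision) for two binary masks of equal shape.

    Hypothetical helper for illustration; `eps` is an assumed smoothing
    term that avoids division by zero when a mask is empty.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    # Pixel counts: true positives, false positives, false negatives.
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()

    dice = (2 * tp + eps) / (2 * tp + fp + fn + eps)
    iou = (tp + eps) / (tp + fp + fn + eps)
    precision = (tp + eps) / (tp + fp + eps)
    return float(dice), float(iou), float(precision)


if __name__ == "__main__":
    # Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    print(segmentation_metrics(pred, target))
```

With this smoothing convention, two empty masks score 1.0 on all three metrics; other conventions (for example, returning 0 or skipping such images) are equally plausible, so published scores should be compared only under the same convention.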
ISSN: 2192-6352
2192-6360
DOI: 10.1007/s13748-023-00300-1