Meningioma segmentation with GV-UNet: a hybrid model using a ghost module and vision transformer

Bibliographic Details
Published in	Signal, image and video processing Vol. 18; no. 3; pp. 2377 - 2390
Main Authors	Bai, Hua; Zhang, Zhuo; Yang, Yong; Niu, Chen; Gao, Qiang; Ma, Quanfeng; Song, Jian
Format	Journal Article
Language	English
Published	London, Springer London, 01.04.2024; Springer Nature B.V.
Subjects	Accuracy; Coders; Computer Imaging; Computer Science; Feature extraction; Ghosts; Image enhancement; Image Processing and Computer Vision; Image segmentation; Machine learning; Mathematical models; Medical imaging; Multimedia Information Systems; Original Paper; Parameters; Pattern Recognition and Graphics; Signal, Image and Speech Processing; Tumors; Vision; Attention mechanism; Transformer; Meningioma; Medical image segmentation
Summary:	Meningiomas are the most common intracranial tumors in adults. Determining the size and shape of a tumor still relies mostly on manual measurement by a neurosurgeon. Deep learning has developed rapidly in recent years and shows great potential for medical image segmentation. However, most segmentation models still cannot balance parameter count against accuracy. In this study, we propose a novel segmentation network, GV-UNet, that combines a CNN and a transformer to improve the accuracy and efficiency of tumor segmentation on T1-enhanced images of meningiomas. GV-UNet uses an encoder–decoder as its main structure. In the downsampling path, features are extracted by a standard convolutional layer, and a ConvMixer layer is used to optimize feature extraction for meningiomas of different sizes. A lightweight transformer block then models long-range dependencies. In the final layer of the encoder, we propose a Ghost-CA block, which reduces the number of parameters by extracting deep features via feature mapping rather than by elevating dimensionality. In the upsampling path, we add SimAM, which incorporates a three-dimensional attention mechanism without increasing the number of network parameters and effectively captures the relationships between features and the spatial structure of the target. GV-UNet was trained and validated on numerous pathologically confirmed T1-enhanced images of meningiomas from Tianjin Huanhu Hospital. We also used meningioma images from a Kaggle dataset to test the robustness of the model.
ISSN:	1863-1703 1863-1711
DOI:	10.1007/s11760-023-02914-3
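
Illustrative note: the summary says SimAM adds three-dimensional attention without adding network parameters. The sketch below shows how such a parameter-free gate can be computed, following the commonly published SimAM formulation and using NumPy instead of a deep learning framework. It is not the authors' code. The function name simam, the regularizer e_lambda and its default value, and the (batch, channels, height, width) array layout are all assumptions made for this example.

```python
import numpy as np


def simam(x, e_lambda=1e-4):
    """SimAM-style attention on an array shaped (batch, channels, height, width).

    Hypothetical helper for illustration only. It has no learnable weights:
    the gate is computed purely from per-channel statistics of the input.
    Assumes height * width > 1.
    """
    n = x.shape[2] * x.shape[3] - 1
    # Squared deviation of every position from its channel's spatial mean.
    dev_sq = (x - x.mean(axis=(2, 3), keepdims=True)) ** 2
    # Per-channel variance estimate over the spatial dimensions.
    var = dev_sq.sum(axis=(2, 3), keepdims=True) / n
    # Inverse energy: positions that differ more from their channel mean score higher.
    inv_energy = dev_sq / (4.0 * (var + e_lambda)) + 0.5
    # A sigmoid turns the score into a per-element (3-D) gate applied to the input.
    return x * (1.0 / (1.0 + np.exp(-inv_energy)))


if __name__ == "__main__":
    feature_map = np.random.default_rng(0).normal(size=(2, 3, 4, 4))
    print(simam(feature_map).shape)
```

Because the gate is derived from the feature map itself, inserting it into a decoder stage changes the activations but leaves the parameter count unchanged, which is the property the summary attributes to SimAM.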