Asymptotic multilayer pooled transformer based strategy for medical assistance in developing countries

Bibliographic Details
Published in: Computers & Electrical Engineering Vol. 119; p. 109493
Main Authors: He, Keke; Li, Limiao; Zhou, Jing; Gou, Fangfang; Wu, Jia
Format: Journal Article
Language: English
Published: Elsevier Ltd 01.10.2024
Summary: As medical imaging technology advances, pathology images have become an important basis on which doctors assess a patient's condition. However, pathology images are numerous, large, and have complex backgrounds, which makes manual recognition exceptionally difficult. Computer-aided diagnosis can improve the accuracy of medical image recognition. Transformer-based methods have made great progress in this field, but their self-attention mechanism has high computational complexity. Although single-layer pooling can reduce the computational cost, it is prone to feature loss, and too many self-attention operations can exacerbate the attention distraction problem. To address this, this study proposes an asymptotic multilayer pooling transformer-based pathology image assistance strategy for medical decision-making systems. First, the pathology images are pre-processed with denoising and enhancement so that the detailed features of the lesion region can be captured accurately. Then the asymptotic multilayer pooling transformer pyramid structure (PMPSNet) is used to identify cell nuclei. In the self-attention module, a multilayer pooling operation shortens the sequence and efficiently captures contextual features. In addition, the strategy uses a pyramid encoder to obtain multi-scale feature maps and a novel multilayer perceptual structure to limit attention interference. The results show that the proposed PMPSNet-L model achieves a DSC (Dice similarity coefficient) of 0.822, about 1.4% better than the second-ranked model, and an IoU (intersection over union) of up to 0.702. In addition, PMPSNet-S keeps a relatively lightweight size of about 14.49 million parameters, considerably fewer than existing segmentation models.
Our proposed method not only achieves higher segmentation accuracy but also has fewer model parameters and lower computational complexity.
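
The DSC and IoU figures quoted in the summary are standard overlap metrics for segmentation masks. The sketch below shows how they are commonly computed for one predicted mask and one ground-truth mask. It is only an illustration: the function name `dice_and_iou`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions made here, not details taken from the article.

```python
def dice_and_iou(pred, target):
    """Return (DSC, IoU) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    # Flatten both masks into 1-D lists of pixels.
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dsc = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dsc, iou


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    dsc, iou = dice_and_iou(pred, target)
    print(f"DSC={dsc:.3f} IoU={iou:.3f}")
```

For a single pair of masks the two metrics are tied by IoU = DSC / (2 - DSC), so a DSC of 0.822 corresponds to an IoU of roughly 0.698. That is close to the reported 0.702; an exact match is not expected if the reported values are averages over many images.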
ISSN: 0045-7906
DOI: 10.1016/j.compeleceng.2024.109493