Noise-induced modality-specific pretext learning for pediatric chest X-ray image classification


Bibliographic Details
Published in: Frontiers in Artificial Intelligence, Vol. 7, p. 1419638
Main Authors: Rajaraman, Sivaramakrishnan; Liang, Zhaohui; Xue, Zhiyun; Antani, Sameer
Format: Journal Article
Language: English
Published: Switzerland: Frontiers Media S.A., 05.09.2024

Summary: Deep learning (DL) has significantly advanced medical image classification. However, it often relies on transfer learning (TL) from models pretrained on large, generic non-medical image datasets such as ImageNet. Medical images, by contrast, possess unique visual characteristics that such general models may not adequately capture. This study examines the effectiveness of modality-specific pretext learning, strengthened by image denoising and deblurring, in enhancing the classification of pediatric chest X-ray (CXR) images into those exhibiting no findings, i.e., normal lungs, or those with cardiopulmonary disease manifestations. Specifically, we leverage the encoder of the pretext-trained architecture in conjunction with a classification head to distinguish normal from abnormal pediatric CXR findings. We benchmark this performance against the traditional TL approach, i.e., the VGG-16 model pretrained only on ImageNet. Measures used for performance evaluation are balanced accuracy, sensitivity, specificity, F-score, Matthews correlation coefficient (MCC), Kappa statistic, and Youden's index. Our findings reveal that models developed from CXR modality-specific pretext encoders substantially outperform the ImageNet-only pretrained model, i.e., the Baseline, achieving significantly higher sensitivity (p < 0.05) with marked improvements in balanced accuracy, F-score, MCC, Kappa statistic, and Youden's index. A novel attention-based fuzzy ensemble of the pretext-learned models further improves performance across these metrics (balanced accuracy: 0.6376; sensitivity: 0.4991; F-score: 0.5102; MCC: 0.2783; Kappa: 0.2782; Youden's index: 0.2751), compared to the Baseline (balanced accuracy: 0.5654; sensitivity: 0.1983; F-score: 0.2977; MCC: 0.1998; Kappa: 0.1599; Youden's index: 0.1327). The superior results of CXR modality-specific pretext learning and its ensemble underscore its potential as a viable alternative to conventional ImageNet pretraining for medical image classification.
Results from this study promote further exploration of medical modality-specific TL techniques in the development of DL models for various medical imaging applications.
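All of the evaluation measures named in the summary can be derived from a binary confusion matrix. A minimal sketch in plain Python, using hypothetical counts for illustration only (not the study's data):

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute the listed evaluation metrics from binary confusion-matrix counts."""
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)              # recall on the abnormal class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed vs. chance agreement
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    youden = sensitivity + specificity - 1    # Youden's index (J statistic)
    return {
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
        "mcc": mcc,
        "kappa": kappa,
        "youden": youden,
    }

# Hypothetical example counts, chosen only to exercise the function
m = binary_metrics(tp=40, fp=10, tn=90, fn=60)
```

Note that Youden's index equals sensitivity + specificity − 1, which is why the abstract's large sensitivity gains translate directly into the reported Youden's index improvements.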
Reviewed by: Dulani Meedeniya, University of Moratuwa, Sri Lanka
Aditya Kumar Sahu, Amrita Vishwa Vidyapeetham University, India
Edited by: Cornelio Yáñez-Márquez, National Polytechnic Institute (IPN), Mexico
ISSN: 2624-8212
DOI: 10.3389/frai.2024.1419638