Noise-induced modality-specific pretext learning for pediatric chest X-ray image classification


Bibliographic Details
Published in: Frontiers in Artificial Intelligence, Vol. 7, p. 1419638
Main Authors: Rajaraman, Sivaramakrishnan; Liang, Zhaohui; Xue, Zhiyun; Antani, Sameer
Format: Journal Article
Language: English
Published: Switzerland: Frontiers Media S.A., 05.09.2024

Summary: Deep learning (DL) has significantly advanced medical image classification. However, it often relies on transfer learning (TL) from models pretrained on large, generic non-medical image datasets such as ImageNet. Medical images, by contrast, possess unique visual characteristics that such general models may not adequately capture. This study examines the effectiveness of modality-specific pretext learning, strengthened by image denoising and deblurring, in enhancing the classification of pediatric chest X-ray (CXR) images into those exhibiting no findings, i.e., normal lungs, or those with cardiopulmonary disease manifestations. Specifically, we leverage the encoder of the pretext-trained architecture in conjunction with a classification head to distinguish normal from abnormal pediatric CXR findings. We benchmark this performance against the traditional TL approach, i.e., the VGG-16 model pretrained only on ImageNet. Measures used for performance evaluation are balanced accuracy, sensitivity, specificity, F-score, Matthews correlation coefficient (MCC), Kappa statistic, and Youden's index. Our findings reveal that models developed from CXR modality-specific pretext encoders substantially outperform the ImageNet-only pretrained model, i.e., the Baseline, achieving significantly higher sensitivity (p < 0.05) with marked improvements in balanced accuracy, F-score, MCC, Kappa statistic, and Youden's index. A novel attention-based fuzzy ensemble of the pretext-learned models further improves performance across these metrics (balanced accuracy: 0.6376; sensitivity: 0.4991; F-score: 0.5102; MCC: 0.2783; Kappa: 0.2782; Youden's index: 0.2751), compared to the Baseline (balanced accuracy: 0.5654; sensitivity: 0.1983; F-score: 0.2977; MCC: 0.1998; Kappa: 0.1599; Youden's index: 0.1327). The superior results of CXR modality-specific pretext learning and its ensemble underscore its potential as a viable alternative to conventional ImageNet pretraining for medical image classification.
Results from this study promote further exploration of medical modality-specific TL techniques in the development of DL models for various medical imaging applications.
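All of the evaluation measures named in the summary can be derived from a binary confusion matrix. A minimal sketch in plain Python, using hypothetical counts for illustration only (not the study's data):

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute the listed evaluation metrics from binary confusion-matrix counts."""
    n = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)              # recall on the abnormal class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed vs. chance agreement
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    youden = sensitivity + specificity - 1    # Youden's index (J statistic)
    return {
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
        "mcc": mcc,
        "kappa": kappa,
        "youden": youden,
    }

# Hypothetical example counts, chosen only to exercise the function
m = binary_metrics(tp=40, fp=10, tn=90, fn=60)
```

Note that Youden's index equals sensitivity + specificity − 1, which is why the abstract's large sensitivity gains translate directly into the reported Youden's index improvements.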
Reviewed by: Dulani Meedeniya, University of Moratuwa, Sri Lanka
Aditya Kumar Sahu, Amrita Vishwa Vidyapeetham University, India
Edited by: Cornelio Yáñez-Márquez, National Polytechnic Institute (IPN), Mexico
ISSN: 2624-8212
DOI: 10.3389/frai.2024.1419638