Improving real-time detection of laryngeal lesions in endoscopic images using a decoupled super-resolution enhanced YOLO

Bibliographic Details
Published in Computer methods and programs in biomedicine Vol. 260; p. 108539
Main Authors Baldini, Chiara; Migliorelli, Lucia; Berardini, Daniele; Azam, Muhammad Adeel; Sampieri, Claudio; Ioppi, Alessandro; Srivastava, Rakesh; Peretti, Giorgio; Mattos, Leonardo S.
Format Journal Article
Language English
Published Ireland Elsevier B.V. 01.03.2025
Summary: Laryngeal Cancer (LC) constitutes approximately one third of head and neck cancers. Detecting early-stage lesions in this anatomical region is crucial for achieving a high survival rate. However, early detection poses significant diagnostic challenges owing to the varied appearance of lesions and the need for precise characterization for appropriate clinical management. Conventional diagnostic approaches rely heavily on endoscopic examination, which often requires expert interpretation and may be limited by subjective assessment. Deep learning (DL) approaches offer promising opportunities for automating lesion detection, but their efficacy in handling multi-modal imaging data and accurately localizing small lesions remains under investigation. Furthermore, the clinical domain may benefit greatly from efficient DL methods that ensure equitable access to advanced technologies even where resources are limited. In this study, a DL-based approach named SRE-YOLO was introduced to provide real-time assistance to less-experienced personnel during laryngeal assessment by automatically detecting lesions at different scales in endoscopic White Light (WL) and Narrow-Band Imaging (NBI) images. During training, SRE-YOLO integrates a YOLOv8 nano (YOLOv8n) baseline with a Super-Resolution (SR) branch to enhance lesion detection. The SR branch is decoupled during inference to preserve the low computational demand of the YOLOv8n baseline. The evaluation was conducted on a multi-center dataset encompassing diverse laryngeal pathologies and acquisition modalities. SRE-YOLO improved the Average Precision (AP@IoU=0.5) in lesion detection by 5% with respect to the YOLOv8n baseline while maintaining an inference speed of 58.8 Frames Per Second (FPS). Comparative analyses against state-of-the-art DL methods highlighted the efficacy of the SRE-YOLO approach in balancing detection accuracy, computational efficiency, and real-time applicability. This research underscores the potential of SRE-YOLO in developing efficient DL-driven decision support systems for real-time detection of laryngeal lesions at different scales from both WL and NBI endoscopic data.
• Deep learning shows promise for early detection of laryngeal lesions in endoscopy.
• Detecting small laryngeal lesions poses a significant challenge.
• Embedding a super-resolution branch in YOLO during training boosts lesion detection.
• Decoupling the super-resolution branch at test time ensures real-time applicability.
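The summary reports detection quality as AP@IoU=0.5. As a rough illustration of what that metric measures, the Python sketch below scores one image for one class: a detection counts as correct when its box overlaps a still-unmatched ground-truth box with an Intersection over Union of at least 0.5, and AP is the area under the resulting precision-recall curve. The function names, the (x1, y1, x2, y2) box format, the greedy matching rule, and the all-point interpolation are assumptions made for illustration; the record does not state which evaluation protocol the authors used.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one image and one class (illustrative, all-point interpolation).

    detections: list of (confidence, box); ground_truth: list of boxes.
    """
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []  # 1 = true positive, 0 = false positive, in confidence order
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Find the best still-unmatched ground-truth box for this detection.
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each detection, highest confidence first.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))

    # Make precision non-increasing, then sum precision over recall steps.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap


if __name__ == "__main__":
    # Made-up boxes, not data from the study.
    gt = [(10, 10, 50, 50), (100, 100, 140, 140)]
    dets = [
        (0.9, (12, 12, 52, 52)),
        (0.8, (200, 200, 240, 240)),
        (0.7, (100, 100, 140, 140)),
    ]
    print(average_precision(dets, gt))
```

In practice AP is accumulated over a whole test set rather than a single image, and toolkits differ in how they interpolate the curve, so values from this sketch are not comparable to the figures in the summary. For the speed figure, 58.8 FPS corresponds to roughly 17 ms of processing time per frame (1000 / 58.8).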
ISSN: 0169-2607, 1872-7565
DOI: 10.1016/j.cmpb.2024.108539