Improving Accessibility for the Visually Impaired with Enhanced Multi-class Object Detection Using YOLOv8L, YOLO11x, and Faster R-CNN with Diverse Backbones
Published in | Journal of Disability Research Vol. 4; no. 4 |
---|---|
Main Authors | Algaraady, Jeehaan; Albuhairy, Mohammad Mahyoob; Khan, Mohammad Zubair |
Format | Journal Article |
Language | English |
Published | 01.09.2025 |
Abstract | Advances in deep learning and computer vision have revolutionized object detection, enabling accurate, real-time object recognition. These technologies could transform accessibility solutions, especially for individuals with visual impairments. This study aims to improve accessibility and interaction with the environment for individuals with visual disabilities by detecting and naming objects in real-world settings. It examines and optimizes a set of deep learning models for multi-class object detection: YOLOv8L, YOLO11x, and Faster region-based convolutional neural network (Faster R-CNN) with seven backbone models. By improving object recognition and providing auditory feedback, these models aim to bridge the gap between visually impaired people and their surroundings. In addition, the study proposes a system that integrates object detection with text-to-speech (TTS) technology to translate detections into audible descriptions, helping individuals navigate and interact with the world independently. The models use Arabic-translated PASCAL VOC 2007 and 2012 datasets, and performance is evaluated using precision, recall, and mean average precision (mAP). The results show that YOLO11x achieves the highest mAP, 0.86, followed by YOLOv8L with an mAP of 0.83. Among the Faster R-CNN backbones, EfficientNet-B3, HRNet-w32, and MobileNetV3-Large achieved the highest accuracy, at 79%, 78%, and 75%, respectively. The study demonstrates the efficacy of deep learning models as assistive technologies for individuals with visual impairments and highlights opportunities for future development. |
---|---|
Author | Albuhairy, Mohammad Mahyoob Algaraady, Jeehaan Khan, Mohammad Zubair |
Author_xml | – sequence: 1 givenname: Jeehaan orcidid: 0000-0003-3901-4648 surname: Algaraady fullname: Algaraady, Jeehaan – sequence: 2 givenname: Mohammad Mahyoob orcidid: 0000-0002-6664-1017 surname: Albuhairy fullname: Albuhairy, Mohammad Mahyoob – sequence: 3 givenname: Mohammad Zubair orcidid: 0000-0002-2409-7172 surname: Khan fullname: Khan, Mohammad Zubair |
ContentType | Journal Article |
DBID | AAYXX CITATION |
DOI | 10.57197/JDR-2025-0642 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | CrossRef |
DeliveryMethod | fulltext_linktorsrc |
EISSN | 2676-2633 |
ExternalDocumentID | 10_57197_JDR_2025_0642 |
GroupedDBID | AAYXX ABDBF ALMA_UNASSIGNED_HOLDINGS CITATION ESX GROUPED_DOAJ |
ISSN | 1658-9912 |
IngestDate | Thu Aug 14 00:02:06 EDT 2025 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Language | English |
License | https://creativecommons.org/licenses/by/4.0 |
LinkModel | OpenURL |
ORCID | 0000-0002-6664-1017 0000-0003-3901-4648 0000-0002-2409-7172 |
OpenAccessLink | https://www.scienceopen.com/hosted-document?doi=10.57197/JDR-2025-0642 |
ParticipantIDs | crossref_primary_10_57197_JDR_2025_0642 |
PublicationCentury | 2000 |
PublicationDate | 2025-09 |
PublicationDateYYYYMMDD | 2025-09-01 |
PublicationDate_xml | – month: 09 year: 2025 text: 2025-09 |
PublicationDecade | 2020 |
PublicationTitle | Journal of Disability Research |
PublicationYear | 2025 |
SourceID | crossref |
SourceType | Index Database |
Title | Improving Accessibility for the Visually Impaired with Enhanced Multi-class Object Detection Using YOLOv8L, YOLO11x, and Faster R-CNN with Diverse Backbones |
Volume | 4 |
linkProvider | Directory of Open Access Journals |
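The abstract reports precision, recall, and mean average precision (mAP) but this record does not say how they were computed. The following is a minimal sketch of one common convention, not the authors' evaluation code: detections are matched to ground-truth boxes by intersection over union (IoU), and average precision is the area under the all-point interpolated precision-recall curve. The 0.5 IoU threshold, the box format, and every function name here are assumptions made for illustration; PASCAL VOC 2007 originally used an 11-point interpolation instead, and the variant used in the study is not stated.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],  # (confidence, box) pairs for one class
    truths: List[Box],                    # ground-truth boxes for the same class
    iou_threshold: float = 0.5,           # assumed threshold, not taken from the paper
) -> float:
    """Area under the all-point interpolated precision-recall curve."""
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    tp = fp = 0
    points: List[Tuple[float, float]] = []  # (recall, precision) after each detection
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Pick the ground-truth box that overlaps this detection the most.
        best_j, best_iou = -1, 0.0
        for j, truth in enumerate(truths):
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        # A detection is a true positive only if it overlaps enough and the
        # ground-truth box has not already been claimed by a stronger detection.
        if best_iou >= iou_threshold and not matched[best_j]:
            matched[best_j] = True
            tp += 1
        else:
            fp += 1
        points.append((tp / len(truths), tp / (tp + fp)))
    ap, prev_recall = 0.0, 0.0
    for i, (recall, _) in enumerate(points):
        # Interpolated precision: best precision at this recall level or higher.
        p_interp = max(p for _, p in points[i:])
        ap += (recall - prev_recall) * p_interp
        prev_recall = recall
    return ap


def mean_average_precision(per_class_ap: Dict[str, float]) -> float:
    """mAP as the unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap) if per_class_ap else 0.0
```

In this sketch, precision is the fraction of detections that are correct and recall is the fraction of ground-truth objects that are found; a detector that finds every object with no false alarms scores an AP of 1.0 for that class.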