TF-YOLO: A Transformer–Fusion-Based YOLO Detector for Multimodal Pedestrian Detection in Autonomous Driving Scenes

Bibliographic Details
Published in: World Electric Vehicle Journal, Vol. 14, No. 12, p. 352
Main Authors: Chen, Yunfan; Ye, Jinxing; Wan, Xiangkui
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 1 December 2023

Abstract Recent research demonstrates that the fusion of multimodal images can improve the performance of pedestrian detectors in low-illumination environments. However, existing multimodal pedestrian detectors cannot adapt to variable environmental illumination: when the lighting conditions of the application environment do not match those of the experimental data, detection performance is likely to degrade significantly. To resolve this problem, we propose a novel transformer–fusion-based YOLO detector that detects pedestrians under diverse illumination conditions, such as nighttime, smog, and heavy rain. Specifically, we develop a novel transformer–fusion module, embedded in a two-stream backbone network, that robustly integrates the latent interactions between multimodal (visible and infrared) images. This enables the multimodal pedestrian detector to adapt to changing illumination conditions. Experimental results on two well-known datasets demonstrate that the proposed approach exhibits superior performance: on the challenging multi-scenario multi-modality dataset, TF-YOLO improves the average precision of the state-of-the-art approach by 3.3% and reduces its miss rate by about 6%.
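The fusion idea the abstract describes — letting features from one modality attend to features from the other inside a two-stream backbone — can be illustrated with a minimal cross-attention sketch. This is not the authors' implementation: the single attention head, the absence of learned query/key/value projections, and the residual connection from visible features to attended infrared context are all simplifying assumptions for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_attention_fuse(rgb_tokens, ir_tokens):
    """Fuse visible (query) and infrared (key/value) feature tokens with
    scaled dot-product cross-attention. Tokens are lists of equal-length
    feature vectors; learned projections are omitted for brevity."""
    d = len(rgb_tokens[0])
    fused = []
    for q in rgb_tokens:
        # similarity of this visible token to every infrared token
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in ir_tokens]
        weights = softmax(scores)
        # attended infrared context: convex combination of infrared tokens
        attended = [sum(w * v[i] for w, v in zip(weights, ir_tokens))
                    for i in range(d)]
        # residual connection: visible features plus infrared context
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused
```

In a full detector this fusion would be applied at several backbone stages, and the fused maps fed to the YOLO detection head; here it only shows the mechanism by which one modality's features are reweighted by the other's.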
CitedBy 10.1109/ACCESS.2025.3526458
10.3390/fire8020038
10.3390/s24072080
10.1016/j.engappai.2024.109705
10.3390/s25051375
Copyright 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
DOI 10.3390/wevj14120352
EISSN 2032-6653
ISSN 2032-6653
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 12
Language English
License https://creativecommons.org/licenses/by/4.0
ORCID 0000-0003-4808-6352
SubjectTerms Computer networks
convolutional neural network
Datasets
deep learning
Detectors
Illumination
Infrared imagery
Light
Lighting
Methods
multimodal images
Neural networks
pedestrian detection
Pedestrians
Performance enhancement
Sensors
Smog
Transformers
URI https://www.proquest.com/docview/2904938233
https://doaj.org/article/96f594d5919846d5b3e994a6d6fa69a5