Exploring Scale-Aware Features for Real-Time Semantic Segmentation of Street Scenes

Bibliographic Details
Published in IEEE Transactions on Intelligent Transportation Systems, Vol. 25, no. 5, pp. 3575-3587
Main Authors Li, Kaige; Geng, Qichuan; Zhou, Zhong
Format Journal Article
Language English
Published New York IEEE 01.05.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract Real-time semantic segmentation of street scenes is an essential and challenging task for autonomous driving systems, which must achieve both high accuracy and high efficiency. Moreover, the numerous objects and stuff at different scales in street scenes further increase the difficulty of this task. To address this challenge, we develop a lightweight and high-accuracy network termed Scale-Aware Network (SANet), which aims to selectively aggregate multi-scale features while maintaining high efficiency. In SANet, we first design a Selective Context Encoding (SCE) module, which considers the intrinsic differences among pixels to selectively encode a private context for each pixel, thus learning more desirable contextual features while reducing redundancy. With the context embedding in hand, we then design a Selective Feature Fusion (SFF) module that recursively fuses it with multiple features at different levels or scales to generate scale-aware features, where each feature map contains scale-specific information. Extensive experiments on challenging street scene datasets, i.e., Cityscapes and CamVid, show that SANet achieves a leading trade-off between segmentation accuracy and speed. Concretely, our method yields 78.1% mIoU at 109.0 FPS on the Cityscapes test set and 77.2% mIoU at 250.4 FPS on the CamVid test set. Code will be available at https://github.com/kaigelee/SANet.
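The abstract reports accuracy as mIoU (mean intersection-over-union). As a reading aid, the metric can be sketched in a few lines of plain Python. This is a minimal illustration only: the function name `mean_iou`, the flat label lists, and the toy data are assumptions made for this note, and it is not the evaluation code of the paper or of the Cityscapes or CamVid benchmarks, which additionally handle ignored labels and accumulate counts over a whole dataset.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes present in pred or target.

    pred and target are flat sequences of integer class labels, one per pixel.
    (Hypothetical helper for illustration; not taken from the SANet code.)
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel is correct: it lies in both the intersection and the union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Pixel is wrong: it enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Skip classes that never occur so they do not distort the average.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious)


# Toy example with three classes and six pixels (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2. Averaging per class rather than per pixel means small classes weigh as much as large ones, which is why the metric is sensitive to objects at different scales.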
Author_xml – sequence: 1
  givenname: Kaige
  orcidid: 0000-0002-1716-4381
  surname: Li
  fullname: Li, Kaige
  email: lkg@buaa.edu.cn
  organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
– sequence: 2
  givenname: Qichuan
  orcidid: 0000-0002-0046-5794
  surname: Geng
  fullname: Geng, Qichuan
  email: gengqichuan1989@cnu.edu.cn
  organization: Information Engineering College, Capital Normal University, Beijing, China
– sequence: 3
  givenname: Zhong
  orcidid: 0000-0002-5825-7517
  surname: Zhou
  fullname: Zhou, Zhong
  email: zz@buaa.edu.cn
  organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
CODEN ITISFG
CitedBy_id crossref_primary_10_1109_TITS_2024_3519162
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
DOI 10.1109/TITS.2023.3330498
Discipline Engineering
EISSN 1558-0016
EndPage 3587
ExternalDocumentID 10_1109_TITS_2023_3330498
10315052
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 62272018
  funderid: 10.13039/501100001809
ISSN 1524-9050
IsPeerReviewed true
IsScholarly true
Issue 5
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PageCount 13
PublicationCentury 2000
PublicationDate 2024-05-01
PublicationDateYYYYMMDD 2024-05-01
PublicationDate_xml – month: 05
  year: 2024
  text: 2024-05-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on intelligent transportation systems
PublicationTitleAbbrev TITS
PublicationYear 2024
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 3575
SubjectTerms Accuracy
Autonomous driving
Context
Context modeling
Convolutional neural networks
Encoding
Feature detection
Feature maps
Image analysis
Modules
Pixels
Real time
real-time semantic segmentation
Real-time systems
Redundancy
Road traffic
selective context encoding
selective feature fusion
Semantic segmentation
Semantics
Street scene understanding
Test sets
Transformers
URI https://ieeexplore.ieee.org/document/10315052
https://www.proquest.com/docview/3055167549
Volume 25