Exploring Scale-Aware Features for Real-Time Semantic Segmentation of Street Scenes

Bibliographic Details
Published in	IEEE transactions on intelligent transportation systems Vol. 25; no. 5; pp. 3575 - 3587
Main Authors	Li, Kaige; Geng, Qichuan; Zhou, Zhong
Format	Journal Article
Language	English
Published	New York IEEE 01.05.2024 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Accuracy; Autonomous driving; Context; Context modeling; Convolutional neural networks; Encoding; Feature detection; Feature maps; Image analysis; Modules; Pixels; Real time; real-time semantic segmentation; Real-time systems; Redundancy; Road traffic; selective context encoding; selective feature fusion; Semantic segmentation; Semantics; Street scene understanding; Test sets; Transformers
Abstract	Real-time semantic segmentation of street scenes is an essential and challenging task for autonomous driving systems, which must achieve both high accuracy and high efficiency. Moreover, the numerous objects and stuff at different scales in street scenes further increase the difficulty of this task. To address this challenge, we develop a lightweight and high-accuracy network termed Scale-Aware Network (SANet), which aims to selectively aggregate multi-scale features while maintaining high efficiency. In SANet, we first design a Selective Context Encoding (SCE) module, which considers the intrinsic differences of various pixels to selectively encode private contexts for each pixel, thus learning more desirable contextual features while reducing redundancy. With the context embedding in hand, we then design a Selective Feature Fusion (SFF) module that recursively fuses it with multiple features at different levels or scales to generate scale-aware features, where each feature map contains scale-specific information. Extensive experiments on challenging street scene datasets, i.e., Cityscapes and CamVid, illustrate that our SANet achieves a leading trade-off between segmentation accuracy and speed. Concretely, our method yields 78.1% mIoU at 109.0 FPS on the Cityscapes test set and 77.2% mIoU at 250.4 FPS on the CamVid test set. Code will be available at https://github.com/kaigelee/SANet.
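The accuracy figures in the abstract are reported as mIoU (mean intersection over union). As a reading aid, here is a minimal sketch of how that metric is commonly computed from flattened per-pixel labels. It is not taken from the SANet code: the function name `mean_iou`, the `ignore_index` value of 255 (a common convention for unlabeled pixels), and the choice to skip classes that never occur are all assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the predictions or ground truth.

    pred, target: equal-length sequences of integer class ids (flattened pixels).
    Pixels whose ground-truth label equals ignore_index are skipped.
    Predictions are assumed to lie in range(num_classes).
    """
    inter = [0] * num_classes  # per-class true positives
    union = [0] * num_classes  # per-class TP + FP + FN
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Two classes, one ignored pixel: each class has IoU 1/2.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2)
```

Benchmark tooling typically accumulates the intersection and union counts over the whole evaluation set before dividing, which this function does if it is given all pixels at once.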
Author	Li, Kaige; Zhou, Zhong; Geng, Qichuan
Author_xml	– sequence: 1 givenname: Kaige orcidid: 0000-0002-1716-4381 surname: Li fullname: Li, Kaige email: lkg@buaa.edu.cn organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China – sequence: 2 givenname: Qichuan orcidid: 0000-0002-0046-5794 surname: Geng fullname: Geng, Qichuan email: gengqichuan1989@cnu.edu.cn organization: Information Engineering College, Capital Normal University, Beijing, China – sequence: 3 givenname: Zhong orcidid: 0000-0002-5825-7517 surname: Zhou fullname: Zhou, Zhong email: zz@buaa.edu.cn organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
CODEN	ITISFG
CitedBy_id	crossref_primary_10_1109_TITS_2024_3519162
ContentType	Journal Article
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
Copyright_xml	– notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
DBID	97E RIA RIE AAYXX CITATION 7SC 7SP 8FD FR3 JQ2 KR7 L7M L~C L~D
DOI	10.1109/TITS.2023.3330498
DatabaseName	IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database Engineering Research Database ProQuest Computer Science Collection Civil Engineering Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional
DatabaseTitle	CrossRef Civil Engineering Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Engineering Research Database Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional
DatabaseTitleList	Civil Engineering Abstracts
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering
EISSN	1558-0016
EndPage	3587
ExternalDocumentID	10_1109_TITS_2023_3330498 10315052
Genre	orig-research
GrantInformation_xml	– fundername: National Natural Science Foundation of China grantid: 62272018 funderid: 10.13039/501100001809
GroupedDBID	-~X 0R~ 29I 4.4 5GY 5VS 6IK 97E AAJGR AARMG AASAJ AAWTH ABAZT ABQJQ ABVLG ACGFO ACGFS ACIWK ACNCT AENEX AETIX AGQYO AGSQL AHBIQ AIBXA AKJIK AKQYR ALMA_UNASSIGNED_HOLDINGS ATWAV BEFXN BFFAM BGNUA BKEBE BPEOZ CS3 DU5 EBS EJD HZ~ H~9 IFIPE IPLJI JAVBF LAI M43 O9- OCL P2P PQQKQ RIA RIE RNS ZY4 AAYXX CITATION RIG 7SC 7SP 8FD FR3 JQ2 KR7 L7M L~C L~D
ID	FETCH-LOGICAL-c294t-4094ed1a0b5ac0a22e8d385a6722255ae58e998155be3fbc7f2d146d85c5980a3
IEDL.DBID	RIE
ISSN	1524-9050
IngestDate	Sun Jun 29 15:20:28 EDT 2025 Tue Jul 01 04:29:14 EDT 2025 Thu Apr 24 23:09:07 EDT 2025 Wed Aug 27 02:33:19 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Issue	5
Language	English
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-c294t-4094ed1a0b5ac0a22e8d385a6722255ae58e998155be3fbc7f2d146d85c5980a3
Notes	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ORCID	0000-0002-0046-5794 0000-0002-5825-7517 0000-0002-1716-4381
PQID	3055167549
PQPubID	75735
PageCount	13
ParticipantIDs	crossref_primary_10_1109_TITS_2023_3330498 proquest_journals_3055167549 ieee_primary_10315052 crossref_citationtrail_10_1109_TITS_2023_3330498
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	2024-05-01
PublicationDateYYYYMMDD	2024-05-01
PublicationDate_xml	– month: 05 year: 2024 text: 2024-05-01 day: 01
PublicationDecade	2020
PublicationPlace	New York
PublicationPlace_xml	– name: New York
PublicationTitle	IEEE transactions on intelligent transportation systems
PublicationTitleAbbrev	TITS
PublicationYear	2024
Publisher	IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml	– name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SSID	ssj0014511
Score	2.427265
SourceID	proquest crossref ieee
SourceType	Aggregation Database Enrichment Source Index Database Publisher
StartPage	3575
SubjectTerms	Accuracy Autonomous driving Context Context modeling Convolutional neural networks Encoding Feature detection Feature maps Image analysis Modules Pixels Real time real-time semantic segmentation Real-time systems Redundancy Road traffic selective context encoding selective feature fusion Semantic segmentation Semantics Street scene understanding Test sets Transformers
Title	Exploring Scale-Aware Features for Real-Time Semantic Segmentation of Street Scenes
URI	https://ieeexplore.ieee.org/document/10315052 https://www.proquest.com/docview/3055167549
Volume	25
hasFullText	1
inHoldings	1