Out-of-Distribution Semantic Segmentation with Disentangled and Calibrated Representation

Bibliographic Details
Published in	IEEE transactions on circuits and systems for video technology p. 1
Main Authors	Wan, Maoxian; Li, Kaige; Geng, Qichuan; Su, Binyi; Zhou, Zhong
Format	Journal Article
Language	English
Published	IEEE 2025
Subjects	Accuracy; Data models; disentangled representations; Image reconstruction; out-of-distribution; Overfitting; Semantic segmentation; Shape; Synthetic data; Training; Training data; Uncertainty
Abstract	Out-of-distribution (OoD) semantic segmentation aims to recognize pixels of classes undefined in the training dataset. Existing methods mostly focus on training the model to fit real OoD data samples to identify OoD pixels, which requires extra data collection and annotation efforts. By contrast, synthesizing OoD data with training data provides a more resource-efficient alternative. However, synthetic data generated from controlled settings lacks diversity, causing the model to suffer from overfitting. To this end, we propose a disentangled representation learning (DRL) method to guide the model to disentangle semantic-related and semantic-unrelated features from synthetic OoD data. DRL encourages the model to utilize the former to identify semantic categories, rather than overfitting to such semantic-unrelated features as synthetic artificiality. Specifically, DRL first incorporates two disentanglers to extract the semantic-related and -unrelated features and then applies a shuffle and reconstruction mechanism to regularize the disentangled features. Furthermore, to facilitate disentangling, we propose a pixel-wise feature similarity calibration (PSC) module, which utilizes more accurate ID-OoD similarity to calibrate inaccurate ID-OoD similarity learned exclusively from ID data. Thus, PSC delivers accurate and stable pixel-wise features for effective disentangling. Extensive experiments illustrate that the proposed method exhibits strong generalization ability. It attains 74.04% AuPRC and 20.82% FPR on Road Anomaly, 69.85% AuPRC and 5.78% FPR on Fishyscapes LostAndFound Validation Set, using SegFormer with the MiT-B5 backbone. Source code is available at https://github.com/WanMotion/DisentangledOoDSeg.
Author 1	Wan, Maoxian (ORCID 0009-0000-5396-0185; wanmaoxian@buaa.edu.cn), State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
Author 2	Li, Kaige (ORCID 0000-0002-1716-4381; lkg@buaa.edu.cn), State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
Author 3	Geng, Qichuan (ORCID 0000-0002-0046-5794; gengqichuan1989@cnu.edu.cn), Capital Normal University, Beijing, China
Author 4	Su, Binyi (ORCID 0000-0002-8024-347X; subinyi@buaa.edu.cn), State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
Author 5	Zhou, Zhong (ORCID 0000-0002-5825-7517; zz@buaa.edu.cn), State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
CODEN	ITCTEM
ContentType	Journal Article
DOI	10.1109/TCSVT.2025.3597071
Discipline	Engineering
EISSN	1558-2205
EndPage	1
ExternalDocumentID	10_1109_TCSVT_2025_3597071 11121350
Genre	orig-research
Funding	Science and Technology Project of Hainan Provincial Department of Transportation (grant HNJTT-KXC-2024-3-22-02); National Natural Science Foundation of China (grants 62206184 and 62272018; funder ID 10.13039/501100001809)
ISSN	1051-8215
IsPeerReviewed	true
IsScholarly	true
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037
PageCount	1
PublicationTitle	IEEE transactions on circuits and systems for video technology
PublicationTitleAbbrev	TCSVT
PublicationYear	2025
Publisher	IEEE
StartPage	1
URI	https://ieeexplore.ieee.org/document/11121350