Out-of-Distribution Semantic Segmentation with Disentangled and Calibrated Representation

Bibliographic Details
Published in IEEE Transactions on Circuits and Systems for Video Technology, p. 1
Main Authors Wan, Maoxian; Li, Kaige; Geng, Qichuan; Su, Binyi; Zhou, Zhong
Format Journal Article
Language English
Published IEEE 2025
Abstract Out-of-distribution (OoD) semantic segmentation aims to recognize pixels belonging to classes not defined in the training dataset. Existing methods mostly train the model on real OoD samples to identify OoD pixels, which requires extra data collection and annotation effort. By contrast, synthesizing OoD data from the training data is a more resource-efficient alternative. However, synthetic data generated under controlled settings lacks diversity, which causes the model to overfit. To address this, we propose a disentangled representation learning (DRL) method that guides the model to separate semantic-related from semantic-unrelated features in synthetic OoD data. DRL encourages the model to use the former to identify semantic categories rather than overfitting to semantic-unrelated features such as synthetic artificiality. Specifically, DRL first uses two disentanglers to extract the semantic-related and semantic-unrelated features and then applies a shuffle-and-reconstruction mechanism to regularize the disentangled features. Furthermore, to facilitate disentangling, we propose a pixel-wise feature similarity calibration (PSC) module, which uses a more accurate ID-OoD similarity (ID denoting in-distribution) to calibrate the inaccurate ID-OoD similarity learned exclusively from ID data. Thus, PSC delivers accurate and stable pixel-wise features for effective disentangling. Extensive experiments show that the proposed method generalizes well: using SegFormer with the MiT-B5 backbone, it attains 74.04% AuPRC and 20.82% FPR on Road Anomaly, and 69.85% AuPRC and 5.78% FPR on the Fishyscapes LostAndFound validation set. Source code is available at https://github.com/WanMotion/DisentangledOoDSeg.
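Note on the reported metrics: the following is a minimal sketch of how AuPRC and FPR are commonly computed from per-pixel OoD scores. It is not taken from the authors' code. It assumes that "FPR" denotes the false-positive rate at 95% true-positive rate (the abstract does not state the operating point), that a higher score means "more likely OoD", and that tied scores can be ignored. The function names `auprc` and `fpr_at_tpr` are hypothetical.

```python
import numpy as np


def auprc(scores, labels):
    """Area under the precision-recall curve, computed as average precision.

    scores: 1-D array of per-pixel OoD scores (higher = more likely OoD).
    labels: 1-D array with 1 for OoD pixels and 0 for in-distribution pixels.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    order = np.argsort(-scores, kind="stable")   # rank pixels by descending score
    ranked = labels[order]
    true_pos = np.cumsum(ranked)                 # OoD pixels retrieved up to each rank
    precision = true_pos / np.arange(1, ranked.size + 1)
    # Average the precision over the ranks at which an OoD pixel is retrieved.
    return float((precision * ranked).sum() / ranked.sum())


def fpr_at_tpr(scores, labels, tpr_target=0.95):
    """False-positive rate at the threshold that recovers tpr_target of OoD pixels.

    The 0.95 default is an assumption, not a value stated in the abstract.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    ood = np.sort(scores[labels == 1])[::-1]     # OoD scores, descending
    needed = int(np.ceil(tpr_target * ood.size)) # OoD pixels that must be detected
    threshold = ood[needed - 1]
    in_dist = scores[labels == 0]
    # Share of in-distribution pixels wrongly flagged at that threshold.
    return float((in_dist >= threshold).mean())


if __name__ == "__main__":
    toy_scores = np.array([0.9, 0.8, 0.7, 0.6, 0.3, 0.2])
    toy_labels = np.array([1, 0, 1, 1, 0, 0])
    print("AuPRC:", auprc(toy_scores, toy_labels))
    print("FPR at 95% TPR:", fpr_at_tpr(toy_scores, toy_labels))
```

In practice the scores and labels would be flattened over all evaluated pixels of a benchmark, with void or ignored pixels removed first. A higher AuPRC and a lower FPR indicate better OoD detection.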
Author Li, Kaige
Wan, Maoxian
Geng, Qichuan
Su, Binyi
Zhou, Zhong
Author_xml – sequence: 1
  givenname: Maoxian
  orcidid: 0009-0000-5396-0185
  surname: Wan
  fullname: Wan, Maoxian
  email: wanmaoxian@buaa.edu.cn
  organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
– sequence: 2
  givenname: Kaige
  orcidid: 0000-0002-1716-4381
  surname: Li
  fullname: Li, Kaige
  email: lkg@buaa.edu.cn
  organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
– sequence: 3
  givenname: Qichuan
  orcidid: 0000-0002-0046-5794
  surname: Geng
  fullname: Geng, Qichuan
  email: gengqichuan1989@cnu.edu.cn
  organization: Capital Normal University, Beijing, China
– sequence: 4
  givenname: Binyi
  orcidid: 0000-0002-8024-347X
  surname: Su
  fullname: Su, Binyi
  email: subinyi@buaa.edu.cn
  organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
– sequence: 5
  givenname: Zhong
  orcidid: 0000-0002-5825-7517
  surname: Zhou
  fullname: Zhou, Zhong
  email: zz@buaa.edu.cn
  organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
CODEN ITCTEM
ContentType Journal Article
DBID 97E
RIA
RIE
AAYXX
CITATION
DOI 10.1109/TCSVT.2025.3597071
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1558-2205
EndPage 1
ExternalDocumentID 10_1109_TCSVT_2025_3597071
11121350
Genre orig-research
GrantInformation_xml – fundername: Science and Technology Project of Hainan Provincial Department of Transportation
  grantid: HNJTT-KXC-2024-3-22-02
– fundername: National Natural Science Foundation of China
  grantid: 62206184; 62272018
  funderid: 10.13039/501100001809
IEDL.DBID RIE
ISSN 1051-8215
IngestDate Thu Aug 14 00:18:40 EDT 2025
Wed Aug 27 01:43:18 EDT 2025
IsPeerReviewed true
IsScholarly true
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
ORCID 0000-0002-1716-4381
0000-0002-0046-5794
0009-0000-5396-0185
0000-0002-5825-7517
0000-0002-8024-347X
PageCount 1
ParticipantIDs crossref_primary_10_1109_TCSVT_2025_3597071
ieee_primary_11121350
PublicationCentury 2000
PublicationDate 2025-00-00
PublicationDateYYYYMMDD 2025-01-01
PublicationDate_xml – year: 2025
  text: 2025-00-00
PublicationDecade 2020
PublicationTitle IEEE Transactions on Circuits and Systems for Video Technology
PublicationTitleAbbrev TCSVT
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
SourceID crossref
ieee
SourceType Index Database
Publisher
StartPage 1
SubjectTerms Accuracy
Data models
disentangled representations
Image reconstruction
out-of-distribution
Overfitting
Semantic segmentation
Shape
Synthetic data
Training
Training data
Uncertainty
Title Out-of-Distribution Semantic Segmentation with Disentangled and Calibrated Representation
URI https://ieeexplore.ieee.org/document/11121350