Out-of-Distribution Semantic Segmentation with Disentangled and Calibrated Representation
Out-of-distribution (OoD) semantic segmentation aims to recognize pixels of classes undefined in the training dataset. Existing methods mostly focus on training the model to fit real OoD data samples to identify OoD pixels, which requires extra data collection and annotation efforts. By contrast, sy...
Published in | IEEE transactions on circuits and systems for video technology p. 1 |
---|---|
Main Authors | Wan, Maoxian; Li, Kaige; Geng, Qichuan; Su, Binyi; Zhou, Zhong |
Format | Journal Article |
Language | English |
Published | IEEE, 2025 |
Subjects | |
Abstract | Out-of-distribution (OoD) semantic segmentation aims to recognize pixels of classes undefined in the training dataset. Existing methods mostly focus on training the model to fit real OoD data samples to identify OoD pixels, which requires extra data collection and annotation efforts. By contrast, synthesizing OoD data with training data provides a more resource-efficient alternative. However, synthetic data generated from controlled settings lacks diversity, causing the model to suffer from overfitting. To this end, we propose a disentangled representation learning (DRL) method to guide the model to disentangle semantic-related and semantic-unrelated features from synthetic OoD data. DRL encourages the model to utilize the former to identify semantic categories, rather than overfitting to semantic-unrelated features such as synthetic artifacts. Specifically, DRL first incorporates two disentanglers to extract the semantic-related and semantic-unrelated features and then applies a shuffle and reconstruction mechanism to regularize the disentangled features. Furthermore, to facilitate disentangling, we propose a pixel-wise feature similarity calibration (PSC) module, which utilizes more accurate similarity between in-distribution (ID) and OoD features to calibrate the inaccurate ID-OoD similarity learned exclusively from ID data. Thus, PSC delivers accurate and stable pixel-wise features for effective disentangling. Extensive experiments illustrate that the proposed method exhibits strong generalization ability. It attains 74.04% AuPRC and 20.82% FPR on Road Anomaly, and 69.85% AuPRC and 5.78% FPR on the Fishyscapes LostAndFound validation set, using SegFormer with the MiT-B5 backbone. Source code is available at https://github.com/WanMotion/DisentangledOoDSeg. |
---|---|
Author | Li, Kaige; Wan, Maoxian; Geng, Qichuan; Su, Binyi; Zhou, Zhong |
Author_xml | – sequence: 1 givenname: Maoxian orcidid: 0009-0000-5396-0185 surname: Wan fullname: Wan, Maoxian email: wanmaoxian@buaa.edu.cn organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China – sequence: 2 givenname: Kaige orcidid: 0000-0002-1716-4381 surname: Li fullname: Li, Kaige email: lkg@buaa.edu.cn organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China – sequence: 3 givenname: Qichuan orcidid: 0000-0002-0046-5794 surname: Geng fullname: Geng, Qichuan email: gengqichuan1989@cnu.edu.cn organization: Capital Normal University, Beijing, China – sequence: 4 givenname: Binyi orcidid: 0000-0002-8024-347X surname: Su fullname: Su, Binyi email: subinyi@buaa.edu.cn organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China – sequence: 5 givenname: Zhong orcidid: 0000-0002-5825-7517 surname: Zhou fullname: Zhou, Zhong email: zz@buaa.edu.cn organization: State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China |
CODEN | ITCTEM |
ContentType | Journal Article |
DOI | 10.1109/TCSVT.2025.3597071 |
Discipline | Engineering |
EISSN | 1558-2205 |
EndPage | 1 |
ExternalDocumentID | 10_1109_TCSVT_2025_3597071 11121350 |
Genre | orig-research |
GrantInformation_xml | – fundername: Science and Technology Project of Hainan Provincial Department of Transportation grantid: HNJTT-KXC-2024-3-22-02 – fundername: National Natural Science Foundation of China grantid: 62206184; 62272018 funderid: 10.13039/501100001809 |
ISSN | 1051-8215 |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
ORCID | 0000-0002-1716-4381 0000-0002-0046-5794 0009-0000-5396-0185 0000-0002-5825-7517 0000-0002-8024-347X |
PageCount | 1 |
PublicationCentury | 2000 |
PublicationDate | 2025-00-00 |
PublicationDateYYYYMMDD | 2025-01-01 |
PublicationDate_xml | – year: 2025 text: 2025-00-00 |
PublicationDecade | 2020 |
PublicationTitle | IEEE transactions on circuits and systems for video technology |
PublicationTitleAbbrev | TCSVT |
PublicationYear | 2025 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 1 |
SubjectTerms | Accuracy; Data models; disentangled representations; Image reconstruction; out-of-distribution; Overfitting; Semantic segmentation; Shape; Synthetic data; Training; Training data; Uncertainty |
Title | Out-of-Distribution Semantic Segmentation with Disentangled and Calibrated Representation |
URI | https://ieeexplore.ieee.org/document/11121350 |