HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation

Bibliographic Details
Published in: Computer Vision – ECCV 2022, Vol. 13690, pp. 372–391
Main Authors: Hoyer, Lukas; Dai, Dengxin; Van Gool, Luc
Format: Book Chapter
Language: English
Published: Springer Nature Switzerland, 2022
Series: Lecture Notes in Computer Science
Summary: Unsupervised domain adaptation (UDA) aims to adapt a model trained on the source domain (e.g. synthetic data) to the target domain (e.g. real-world data) without requiring further annotations on the target domain. This work focuses on UDA for semantic segmentation, as real-world pixel-wise annotations are particularly expensive to acquire. As UDA methods for semantic segmentation are usually GPU memory intensive, most previous methods operate only on downscaled images. We question this design, as low-resolution predictions often fail to preserve fine details. The alternative of training with random crops of high-resolution images alleviates this problem but falls short in capturing long-range, domain-robust context information. Therefore, we propose HRDA, a multi-resolution training approach for UDA that combines the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies with a learned scale attention, while maintaining a manageable GPU memory footprint. HRDA enables adapting small objects and preserving fine segmentation details. It significantly improves the state-of-the-art performance by 5.5 mIoU for GTA→Cityscapes and 4.9 mIoU for Synthia→Cityscapes, resulting in unprecedented 73.8 and 65.8 mIoU, respectively. The implementation is available at github.com/lhoyer/HRDA.
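To make the fusion idea in the summary concrete, below is a minimal PyTorch-style sketch of blending a low-resolution context prediction with a high-resolution detail prediction via a learned, sigmoid-gated scale attention. The function name fuse_multi_res, the argument layout, and the masking scheme are illustrative assumptions, not the authors' actual API; the real implementation lives at github.com/lhoyer/HRDA.

# Minimal sketch of HRDA-style multi-resolution fusion with learned scale
# attention, under assumed tensor shapes. Names are hypothetical, not the
# authors' API.
import torch
import torch.nn.functional as F

def fuse_multi_res(logits_lr, logits_hr, attn_logit, hr_box, full_size):
    """Blend low-res context logits with high-res detail logits.

    logits_lr:  (B, C, h, w)   prediction from the large low-resolution context crop
    logits_hr:  (B, C, h', w') prediction from the small high-resolution detail crop
    attn_logit: (B, 1, h, w)   scale-attention output (pre-sigmoid)
    hr_box:     (top, left, height, width) of the detail crop in full_size coordinates
    full_size:  (H, W) output resolution
    """
    H, W = full_size
    # Upsample the context prediction and the attention map to full resolution.
    ctx = F.interpolate(logits_lr, size=(H, W), mode='bilinear',
                        align_corners=False)
    a = torch.sigmoid(F.interpolate(attn_logit, size=(H, W), mode='bilinear',
                                    align_corners=False))
    # Paste the detail prediction into a full-size canvas; outside the detail
    # crop only the context prediction exists, so attention is zeroed there.
    top, left, h, w = hr_box
    det = torch.zeros_like(ctx)
    det[:, :, top:top + h, left:left + w] = F.interpolate(
        logits_hr, size=(h, w), mode='bilinear', align_corners=False)
    mask = torch.zeros_like(a)
    mask[:, :, top:top + h, left:left + w] = 1.0
    a = a * mask
    # Attention-weighted sum: a favors the high-resolution detail prediction
    # where fine structures matter; (1 - a) keeps the long-range context.
    return a * det + (1.0 - a) * ctx

In the paper the attention is predicted by a lightweight scale-attention head on the low-resolution features; here it is simply taken as an input tensor to keep the sketch self-contained.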
Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-20056-4_22.
ISBN: 9783031200557; 3031200551
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-031-20056-4_22