SSR: SAM is a Strong Regularizer for domain adaptive semantic segmentation
Main Authors | , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 26.01.2024 |
Subjects | |
Summary: | We introduce SSR, which uses SAM (Segment Anything) as a strong
regularizer during training to greatly improve the robustness of the image
encoder across domains. Because SAM is pre-trained on a large number of
internet images covering a diverse range of domains, the features extracted by
SAM depend less on specific domains than those of a traditional ImageNet
pre-trained image encoder. Meanwhile, an ImageNet pre-trained image encoder
remains a mature choice of backbone for semantic segmentation, especially
since SAM itself is category-agnostic. As a result, SSR is a simple yet highly
effective design: it uses the ImageNet pre-trained image encoder as the
backbone, and the intermediate feature of each stage (e.g., MiT-B5 has 4
stages) is regularized by SAM during training. In extensive experiments on
GTA5$\rightarrow$Cityscapes, SSR significantly improves performance over the
baseline without introducing any extra inference overhead. |
---|---|
DOI: | 10.48550/arxiv.2401.14686 |
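
The training objective described in the summary can be sketched roughly as follows. This is only an illustrative, framework-free sketch and not the authors' implementation: the summary does not say which distance is used or how backbone and SAM features are aligned, so the mean-squared-error term, the `reg_weight` parameter, and the function names `mse` and `ssr_style_loss` are all assumptions. Features are shown as flat lists of floats purely for readability.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally sized flat feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def ssr_style_loss(
    seg_loss: float,
    backbone_feats: Sequence[Sequence[float]],
    sam_feats: Sequence[Sequence[float]],
    reg_weight: float = 1.0,  # hypothetical weighting, not taken from the paper
) -> float:
    """Segmentation loss plus one regularization term per backbone stage.

    backbone_feats[i] is the stage-i feature of the ImageNet pre-trained
    backbone; sam_feats[i] is the SAM feature it is pulled toward.
    MSE is an assumed choice of distance.
    """
    if len(backbone_feats) != len(sam_feats):
        raise ValueError("need one SAM target per backbone stage")
    reg = sum(mse(f, g) for f, g in zip(backbone_feats, sam_feats))
    return seg_loss + reg_weight * reg


# Usage: four stages, as in MiT-B5. SAM only appears in the training loss,
# so it can be dropped entirely at inference time.
stage_feats = [[0.0, 0.0] for _ in range(4)]
sam_targets = [[1.0, 1.0] for _ in range(4)]
total = ssr_style_loss(seg_loss=0.5, backbone_feats=stage_feats,
                       sam_feats=sam_targets, reg_weight=0.5)
```

Because the regularizer enters only through the loss, the deployed model is just the backbone and its segmentation head, which is consistent with the summary's claim of no extra inference overhead.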