Make Segment Anything Model Perfect on Shadow Detection

Bibliographic Details
Published in IEEE transactions on geoscience and remote sensing Vol. 61; pp. 1-13
Main Authors Chen, Xiao-Diao; Wu, Wen; Yang, Wenya; Qin, Hongshuai; Wu, Xiantao; Mao, Xiaoyang
Format Journal Article
Language English
Published New York IEEE 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Compared to models pretrained on ImageNet, the segment anything model (SAM) has been trained on a massive segmentation corpus, excelling in both generalization ability and boundary localization. However, these strengths are still insufficient to enhance shadow detection without additional training, and it raises the question: do we still need precise manual annotations to fine-tune SAM for high detection accuracy? This article proposes an annotation-free framework for deep unsupervised shadow detection (USD) by leveraging SAM's capabilities. The key lies in how to exploit the abilities acquired from a large-scale corpus and utilize them to improve downstream tasks. Instead of directly fine-tuning SAM, we propose a prompt-like tuning method to inject task-specific cues into SAM in a lightweight manner, namely, ShadowSAM. This adaptation manner can ensure a good fitting when training data are limited. Moreover, considering that the pseudo labels used in our framework are generated by traditional USD approaches and may contain severe label noises, we propose an illumination and texture-guided updating (ITU) strategy to selectively boost the quality of pseudo masks. To further improve the model's robustness, we design a mask diversity index (MDI) to establish easy-to-hard subsets for incremental curriculum learning. Extensive experiments on benchmark datasets (i.e., SBU, UCF, ISTD, and CUHK-Shadow) demonstrate that our unsupervised solution can achieve comparable performance to state-of-the-art (SOTA) fully supervised methods. Our code is available at this repository.
ISSN: 0196-2892, 1558-0644
DOI: 10.1109/TGRS.2023.3332257
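
Note on the summary: the "easy-to-hard subsets for incremental curriculum learning" can be pictured with a small sketch. This is an illustration only, not the authors' code. The summary does not give the MDI formula, so the per-sample `scores` below are a hypothetical stand-in (lower means easier), and the function name `curriculum_stages` and the equal-sized stage split are assumptions as well.

```python
from typing import List, Sequence


def curriculum_stages(scores: Sequence[float], num_stages: int) -> List[List[int]]:
    """Return cumulative easy-to-hard index subsets, one per training stage.

    `scores` is a hypothetical per-sample difficulty value (lower = easier).
    It stands in for the paper's MDI, whose definition the summary omits.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # Sort sample indices from easiest to hardest (ties keep input order).
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    stages = []
    for k in range(1, num_stages + 1):
        # Stage k covers the easiest k/num_stages fraction of the samples,
        # so each stage is a superset of the previous one.
        cutoff = (len(order) * k) // num_stages
        stages.append(order[:cutoff])
    return stages


if __name__ == "__main__":
    example_scores = [0.9, 0.1, 0.5, 0.3]  # made-up difficulty values
    for stage, subset in enumerate(curriculum_stages(example_scores, 2), start=1):
        print(f"stage {stage}: sample indices {subset}")
```

Each stage reuses the earlier, easier samples and adds harder ones, which is one common reading of "incremental" curriculum learning; the paper may schedule its subsets differently.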