RingMo-SAM: A Foundation Model for Segment Anything in Multimodal Remote-Sensing Images
The proposal of the segment anything model (SAM) has created a new paradigm for the deep-learning-based semantic segmentation field and has shown amazing generalization performance. However, we find it may fail or perform poorly on multimodal remote-sensing scenarios, especially synthetic aperture r...
Saved in:
Published in | IEEE transactions on geoscience and remote sensing Vol. 61; pp. 1 - 16 |
---|---|
Main Authors | , , , , , , , , |
Format | Journal Article |
Language | English |
Published |
New York
IEEE
2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Subjects | |
Online Access | Get full text |
Cover
Loading…
Summary: | The proposal of the segment anything model (SAM) has created a new paradigm for the deep-learning-based semantic segmentation field and has shown amazing generalization performance. However, we find it may fail or perform poorly on multimodal remote-sensing scenarios, especially synthetic aperture radar (SAR) images. Besides, SAM does not provide category information for objects. In this article, we propose a foundation model for multimodal remote-sensing image segmentation called RingMo-SAM, which can not only segment anything in optical and SAR remote-sensing data, but also identify object categories. First, a large-scale dataset containing millions of segmentation instances is constructed by collecting multiple open-source datasets in this field to train the model. Then, by constructing an instance-type and terrain-type category-decoupling mask decoder (CDMDecoder), the categorywise segmentation of various objects is achieved. In addition, a prompt encoder embedded with the characteristics of multimodal remote-sensing data is designed. It not only supports multibox prompts to improve the segmentation accuracy of multiobjects in complicated remote-sensing scenes, but also supports SAR characteristics prompts to improve the segmentation performance on SAR images. Extensive experimental results on several datasets including iSAID, ISPRS Vaihingen, ISPRS Potsdam, AIR-PolSAR-Seg, and so on have demonstrated the effectiveness of our method. |
---|---|
Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
ISSN: | 0196-2892 1558-0644 |
DOI: | 10.1109/TGRS.2023.3332219 |