SSL4EO-S12: A large-scale multimodal, multitemporal dataset for self-supervised learning in Earth observation [Software and Data Sets]
Published in: IEEE Geoscience and Remote Sensing Magazine, Vol. 11, No. 3, pp. 98–106
Main Authors: Yi Wang, Nassim Ait Ali Braham, Zhitong Xiong, Chenying Liu, Conrad M. Albrecht, Xiao Xiang Zhu
Format: Journal Article
Language: English
Published: IEEE, 01.09.2023
Summary: Self-supervised pretraining bears the potential to generate expressive representations from large-scale Earth observation (EO) data without human annotation. However, most existing pretraining in the field is based on ImageNet or medium-sized, labeled remote sensing (RS) datasets. In this article, we share an unlabeled dataset, Self-Supervised Learning for Earth Observation-Sentinel-1/2 (SSL4EO-S12), which assembles a large-scale, global, multimodal, and multiseasonal corpus of satellite imagery. We demonstrate that SSL4EO-S12 succeeds in self-supervised pretraining for a set of representative methods, momentum contrast (MoCo), self-distillation with no labels (DINO), masked autoencoders (MAE), and data2vec, and for multiple downstream applications, including scene classification, semantic segmentation, and change detection. Our benchmark results prove the effectiveness of SSL4EO-S12 compared to existing datasets. The dataset, related source code, and pretrained models are available at https://github.com/zhu-xlab/SSL4EO-S12.
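To make the pretraining idea concrete, the sketch below shows a MoCo-style InfoNCE loss in PyTorch, where two seasonal snapshots of the same location serve as the positive pair, the kind of natural augmentation a multiseasonal corpus like SSL4EO-S12 affords. This is a minimal illustrative sketch, not the authors' released implementation; the helper names, tensor shapes, and hyperparameters are assumptions.

```python
# Illustrative sketch (not the authors' code): MoCo-style InfoNCE where two
# seasons of the same location act as the query/key views.
import random
import torch
import torch.nn.functional as F

def info_nce(q, k, queue, temperature=0.2):
    """InfoNCE loss. q, k: L2-normalized embeddings [B, D] of the two views;
    queue: L2-normalized negative keys [K, D] (the MoCo memory bank)."""
    l_pos = torch.einsum("bd,bd->b", q, k).unsqueeze(-1)   # [B, 1] positive logits
    l_neg = torch.einsum("bd,kd->bk", q, queue)            # [B, K] negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    # The positive key sits at index 0 of each row of logits.
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)

def seasonal_views(stack):
    """stack: [4, C, H, W] tensor holding four seasonal snapshots of one
    location (assumed layout). Draw two distinct seasons as the two views."""
    i, j = random.sample(range(stack.size(0)), 2)
    return stack[i], stack[j]

if __name__ == "__main__":
    stack = torch.randn(4, 13, 264, 264)  # e.g., four seasons of a Sentinel-2 patch (assumed shape)
    view_q, view_k = seasonal_views(stack)
    # Stand-in embeddings in place of the query/momentum encoder outputs:
    q = F.normalize(torch.randn(8, 128), dim=1)
    k = F.normalize(torch.randn(8, 128), dim=1)
    queue = F.normalize(torch.randn(4096, 128), dim=1)
    print(info_nce(q, k, queue).item())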
ISSN: 2473-2397; 2168-6831
DOI: 10.1109/MGRS.2023.3281651