cloudy_banner

SEN12MS-CR-TS DATASET

A Dataset and Remote Sensing Benchmark
for Multimodal & Multitemporal Cloud Removal





Introduction

On average, over half of all optical observations acquired via spaceborne Earth imagery are affected by clouds. As cloud coverage severely impedes the ongoing observation of Earth, the automated reconstruction of noisy or cloud covered information is a persistent problem in signal processing and remote sensing. While classical remote sensing applications oftentimes focused on narrowly-defined areals and case studies, the increasing availability of public-access, daily and large-scale satellite monitoring shifted the community's interest towards globally applicable methodology. To support the development of modern machine learning techniques for the purpose of cloud removal on whole-planet satellite data, we curated a large data set for training and evaluating new approaches.


This data set, SEN12MS-CR-TS, is a multi-modal and multi-temporal data set for cloud removal. It contains paired and co-registered sequences of space-borne radar measurements practically unaffected by clouds, as well as cloud-covered and cloud-free multi-spectral optical satellite observations. Complementary to the radar modality's cloud-robust information, historical satellite data is collected to provide additional sensor information for more faithfully reconstructing cloud-covered pixels. The radar and optical data are collected via Sentinel-1 and Sentinel-2 satellites from European Space Agency's Copernicus mission, respectively. The Sentinel satellites provide public access data and are among the most prominent satellites in Earth observation.


Exemplary data samples. Every time point of an observation in SEN12MS-CR-TS is a tuple consisting of co-registered 2-bands Sentinel-1 radar measurements, as well as 13-bands Sentinel-2 optical observations. Automatically annotated cloud masks are shown below. There are 30 repeated measures throughout the year of 2018 for each ROI in the data set.

SEN12MS-CR-TS is the first public data set for cloud removal in Earth observation to provide repeates measures of multi-sensor data at a large-scale global and all-season coverage. Based on the observation that cloud coverage varies widely in practice, all scenarios ranging from clear skies to absolute coverage are contained in the train and test data. By making this curated and readily pre-processed data set available to the research community, we hope to hope to help advance automated cloud removal in optical satellite data.


Exemplary sequence-to-sequence cloud removal results. Left to right: Radar time series, (cloud-covered) optical time series and cloud-removed optical predictions.



Statistics

SEN12MS-CR-TS is a global data set for multi-modal and multi-temporal cloud removal. It contains observations covering 53 globally distributed Regions of Interest, backward-compatible with the SEN12MS-CR data set. For each region, 30 temporally aligned synthetic aperture radar Sentinel-1 as well as optical multi-spectral Sentinel-2 images are collected throughout the whole year of 2018. 30 time samples for each of the 15578 patch-wise observations, for every S1 and S2 measurement. The approximate cloud coverage of all data is about 50%---reaching from clear-view images (e.g. for validation purposes), over semi-transparent haze or small clouds to dense and wide cloud coverage. The geospatial locations of ROI and our train and test split definitions are compatible with the mono-temporal SEN12MS-CR data set for cloud removal, so both data sets can complement one another and be combined for developing and benchmarking cloud removal solutions.

Geospatial distribution of ROI in SEN12MS-CR-TS. Train split (blue), test split (violet), complementary SEN12MS-CR-TS ROI (gray and black).




Benchmarking

Method RMSE ↓ SAM ↓ PSNR ↑ SSIM ↑
least cloudy 0.079 12.204 --- 0.815
mosaicing 0.062 14.324 31.68 0.811
DSen2-CR (Meraner et al., 2020) 0.060 12.147 26.04 0.810
STGAN (Sarukkai et al., 2020) 0.057 12.548 25.42 0.818
CR-TS Net (Ebel et al., 2022) 0.051 10.657 26.68 0.836
U-TAE (Garnot et al., 2021) 0.051 11.649 27.05 0.849
UnCRtainTS L2 (Ebel et al., 2023) 0.049 10.168 27.23 0.859
UnCRtainTS σ (Ebel et al., 2023) 0.051 10.160 27.84 0.866
SeqDMs (Zhao & Jia, 2023) 0.045 12.777 28.07 0.827
UnCRtainTS ensemble (Ebel et al., 2023) 0.049 10.18 28.19 0.872

Note that ensemble methods should be considered a category on their own, for the sake of a fair comparison.

Feel free to report this benchmarking of prior methods in your work utilizing SEN12MS-CR-TS.
Please reach out to us if you would like to have your work referenced and your model included in the benchmark.





Download

1. Dataset

Download Link here (train split) and here (test split)

Note: You can also download (parts of) the data in the terminal (passwd: m1639953 or m1659251) using wget or rsync, for instance via
wget "ftp://m1639953:m1639953@dataserv.ub.tum.de/s1_africa.tar.gz" ,
rsync -chavzP --stats rsync://m1639953@dataserv.ub.tum.de/m1639953/ . or, likewise, for the test split
rsync -chavzP --stats rsync://m1659251@dataserv.ub.tum.de/m1659251/ .

Update: You can now easily get the dataset via this automated downloading script here.

2. Code

Download Link here

3. Trained Models

Cloud Removal Models

Model Weights Download Description
CR-TS Net (t=3) pCloud Share [PyTorch] The network trained on SEN12MS-CR-TS in Ebel et al 2022 on t=3 temporal samples of paired radar and cloudy optical satellite observations.
CR-TS Net (t=4) pCloud Share [PyTorch] The network trained on SEN12MS-CR-TS in Ebel et al 2022 on t=4 temporal samples of paired radar and cloudy optical satellite observations.
CR-TS Net (t=5) pCloud Share [PyTorch] The network trained on SEN12MS-CR-TS in Ebel et al 2022 on t=5 temporal samples of paired radar and cloudy optical satellite observations.
UnCRtainTS pCloud Share [PyTorch] All varieties of the uncertainty prediction networks trained on SEN12MS-CR-TS in Ebel et al 2023 on paired radar and cloudy optical satellite observations. Code here.


Complementary Models

Model Weights Download Description
VGG16 pCloud Share [PyTorch] A VGG16 trained on 13-bands Sentinel-2 samples for global scene and land cover type classification, trained on SEN12MS. Can be used for perceptual losses, style transfer, multi-spectral Fréchet inception distance and other metrics, or for other purposes.
ResNet pCloud Share [PyTorch] A ResNet16 pre-trained for mono-temporal cloud removal on SEN12MS-CR. Takes radar and multi-spectral satellite data as inputs to make cloud-removed multispectral optical predictions.





References

1. Citation

    If you utilize this dataset in your work, please use the following citation:

@article{sen12mscrts,
        title = {{SEN12MS-CR-TS: A Remote Sensing Data Set for Multi-modal Multi-temporal Cloud Removal}},
        author = {Ebel, Patrick and Xu, Yajin and Schmitt, Michael and Zhu, Xiao Xiang},
        journal = {IEEE Transactions on Geoscience and Remote Sensing},
        year = {2022}
        publisher = {IEEE}
}






© Patrick Ebel www.pwjebel.com.