Spatiotemporal Fusion Network for Land Surface Temperature Based on a Conditional Variational Autoencoder

Bibliographic Details
Published in	IEEE Transactions on Geoscience and Remote Sensing, Vol. 60, pp. 1-13
Main Authors	Chen, Yuncheng, Yang, Yingbao, Pan, Xin, Meng, Xiangjin, Hu, Jia
Format	Journal Article
Language	English
Published	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022
Subjects	Biological system modeling; Climate change; Climate change research; Climate prediction; Coders; Conditional variational autoencoder (CVAE); convolutional neural network (CNN); Data models; Deep learning; Heterogeneity; Image reconstruction; Instruments; Land surface temperature; land surface temperature (LST); Noise reduction; Outliers (statistics); pretraining; Process parameters; Remote sensing; River basins; Root-mean-square errors; Spatial resolution; spatiotemporal fusion; Spatiotemporal phenomena; Surface temperature; Temperature sensors; Training; Weighting; Work platforms
Summary:	High spatiotemporal resolution land surface temperature (LST) data are essential for dynamic monitoring and prediction in climate change research. Because of the limitations of remote sensing instruments, current platforms struggle to provide LST products that combine high spatial and high temporal resolution. In this study, we propose a spatiotemporal fusion network for LST based on a conditional variational autoencoder (CVAE-LSTFM). First, an improved network is designed based on the CVAE by reconstructing an encoder and a decoder. To generate fine LST images based on dense time series, a variational inference model is formulated to integrate coarse and fine LST image pairs in the variational autoencoded latent space. In addition, a new compound loss function for the proposed training method is designed to reduce the effects of noise and outliers. Then, a pretraining mechanism is adopted to optimize the network training process; the pretrained parameters can be transferred to the training network of the CVAE to accelerate convergence. Finally, a novel weighting strategy that considers spatiotemporal variations in LST (LST consistency weighting) is employed to solve the spatiotemporal heterogeneity problem caused by rapid changes in LST. The method is quantitatively tested and evaluated in the Heihe River Basin using FY-4A LST and MODIS LST from September 2019. Compared with two traditional models and two deep learning-based models, CVAE-LSTFM yields lower root-mean-square error (RMSE, average < 1.26 K) and learned perceptual image patch similarity (LPIPS, average < 0.13) values and higher structural similarity (SSIM, average > 0.96). In practice, CVAE-LSTFM can generate high spatiotemporal resolution LST values (hourly LST with a 1-km spatial resolution) with high accuracy, quality, and robustness.
ISSN:	0196-2892 1558-0644
DOI:	10.1109/TGRS.2022.3183114
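
The summary reports accuracy as RMSE in kelvin. As a rough illustration of how that metric is usually computed, here is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `rmse_kelvin`, the convention of skipping NaN pixels (assumed here to stand for cloud-masked values), and the sample numbers are all assumptions made for illustration.

```python
import math


def rmse_kelvin(predicted, reference):
    """Root-mean-square error between two equally sized sequences of LST values (K).

    Pairs in which either value is NaN are skipped; this sketch assumes NaN
    marks a masked pixel, which is a convention chosen here, not one taken
    from the paper.
    """
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    squared_errors = [
        (p - r) ** 2
        for p, r in zip(predicted, reference)
        if not (math.isnan(p) or math.isnan(r))
    ]
    if not squared_errors:
        raise ValueError("no valid pixel pairs to compare")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical pixel values, for illustration only (not data from the paper).
fused = [290.0, 295.0, float("nan"), 300.0]
reference = [291.0, 293.0, 288.0, 300.0]
print(round(rmse_kelvin(fused, reference), 3))
```

With these sample values the third pair is skipped, so the error is averaged over three pixels rather than four. A real evaluation would typically operate on 2-D image arrays and would also report SSIM and LPIPS, which need dedicated implementations.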