Enhancing meteorological data reliability: An explainable deep learning method for anomaly detection

Bibliographic Details
Published in Journal of environmental management Vol. 374; p. 124011
Main Authors Qu, Zhongke; Xiao, Ruizhi; Yang, Ke; Li, Mingjuan; Hu, Xinyu; Liu, Zhichao; Luo, Xilian; Gu, Zhaolin; Li, Chengwei
Format Journal Article
Language English
Published England Elsevier Ltd 01.02.2025
Summary: Accurate meteorological observation data are of great importance to human production activities. Meteorological observation systems have been advancing toward automation, intelligence, and informatization. Yet instrument malfunctions and unstable sensor node resources can cause the data to deviate significantly from the actual conditions they should reflect. Achieving greater data accuracy requires early detection of data anomalies together with continuous collection and timely transmission of data. While obvious anomalies can be readily identified, detecting systematic and gradually emerging anomalies requires further analysis. This study develops an interpretable deep learning method based on an autoencoder (AE), SHapley Additive exPlanations (SHAP), and Bayesian optimization (BO) to enable prompt and accurate anomaly detection in meteorological observation data. The proposed method has four parts. First, the AE detects anomalies in multidimensional meteorological datasets by flagging data that show large reconstruction errors. Second, SHAP is used to evaluate the importance of each meteorological element in the flagged data. Third, a K-sigma-based automatic thresholding method derives anomaly thresholds suited to the data characteristics of each observation site. Finally, the BO algorithm tunes hard-to-set hyperparameters, improving the model's structure and thus the accuracy of anomaly detection. In practice, the proposed model can inform agricultural production, climate observation, and disaster prevention.
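The K-sigma thresholding step described in the summary can be illustrated with a minimal sketch. The record does not give the authors' exact formula, so this assumes the common form threshold = mean + k × standard deviation, computed separately for each site's reconstruction errors. The function names, the value k = 3, and the synthetic error values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def k_sigma_threshold(errors, k=3.0):
    """Assumed K-sigma rule: mean + k * std of one site's reconstruction errors."""
    errors = np.asarray(errors, dtype=float)
    return float(errors.mean() + k * errors.std())

def flag_anomalies(errors, k=3.0):
    """Boolean mask marking errors above the site-specific threshold."""
    errors = np.asarray(errors, dtype=float)
    return errors > k_sigma_threshold(errors, k)

# Synthetic stand-in for autoencoder reconstruction errors at one site
# (not real data): small errors for normal samples, two injected large ones.
rng = np.random.default_rng(0)
site_errors = np.abs(rng.normal(0.0, 0.1, size=500))
site_errors[[50, 300]] = 2.0

threshold = k_sigma_threshold(site_errors, k=3.0)
flags = flag_anomalies(site_errors, k=3.0)
print(f"threshold={threshold:.3f}, flagged={int(flags.sum())}")
```

Because the threshold is derived from each site's own error distribution, a site with noisier sensors gets a looser threshold than a quiet one, which is the kind of site-specific adjustment the summary describes.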
• The study pioneers anomaly detection across multidimensional meteorological datasets, rather than being limited to outlier detection for a single meteorological element.
• Combines autoencoders (AE) and SHapley Additive exPlanations (SHAP) to enhance the interpretability of models in meteorological anomaly detection, making it possible to understand which features contribute most to anomalies.
• Introduces a novel K-sigma-based method for automatically setting anomaly thresholds, which adjusts dynamically to the specific characteristics of different observation sites and overcomes the limitations of manual threshold setting.
• Employs Bayesian optimization to fine-tune critical hyperparameters of the model, supporting the selection of well-performing parameters for improved accuracy and generalizability.
• The proposed method achieves superior accuracy, outperforming baseline models such as Peaks-over-Threshold, One-Class SVM, Isolation Forest, and Local Outlier Factor.
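To make concrete how Shapley-style attribution can show which meteorological element contributes most to a flagged sample, the sketch below computes exact Shapley values by enumerating feature subsets. It is not the authors' pipeline and does not use the `shap` library: the scoring function is a hypothetical stand-in for an autoencoder's reconstruction error, and the feature order, "normal" profile, and sample values are made up for illustration.

```python
from itertools import combinations
from math import factorial
import numpy as np

# Hypothetical "normal" profile: temperature (°C), humidity (%),
# pressure (hPa), wind speed (m/s). Illustrative values only.
normal_mean = np.array([15.0, 60.0, 1013.0, 3.0])
normal_std = np.array([5.0, 10.0, 5.0, 1.5])

def anomaly_score(z):
    """Stand-in for reconstruction error: squared standardized deviation."""
    return float(np.sum(((z - normal_mean) / normal_std) ** 2))

def shapley_values(score, x, baseline):
    """Exact Shapley values; features outside a subset take baseline values."""
    n = len(x)
    phi = np.zeros(n)

    def value(subset):
        z = baseline.copy()
        for j in subset:
            z[j] = x[j]
        return score(z)

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Marginal contribution of feature i when added to this subset
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

x = np.array([16.0, 62.0, 1013.0, 12.0])   # wind speed is far from normal
baseline = normal_mean.copy()
phi = shapley_values(anomaly_score, x, baseline)
print(phi, int(np.argmax(phi)))
```

Exact enumeration scales exponentially with the number of features, which is why libraries such as `shap` rely on sampling or model-specific approximations; for a handful of meteorological elements the brute-force version is enough to show the idea.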
ISSN: 0301-4797
1095-8630
DOI: 10.1016/j.jenvman.2024.124011