Time series prediction of the chemical components of PM2.5 based on a deep learning model

Bibliographic Details
Published in Chemosphere (Oxford) Vol. 342; p. 140153
Main Authors Liu, Kai; Zhang, Yuanhang; He, Huan; Xiao, Hui; Wang, Siyuan; Zhang, Yuteng; Li, Huiming; Qian, Xin
Format Journal Article
Language English
Published Elsevier Ltd 01.11.2023

Summary: Compared with traditional chemical reaction-based detection methods, modeling-based prediction methods enable rapid, reagent-free air pollution detection from inexpensive multi-source data, so that the air pollution situation can be assessed quickly. In this study, a convolutional neural network (CNN) and a long short-term memory (LSTM) neural network are integrated to create a CNN-LSTM time series prediction model that predicts the concentrations of PM2.5 and its chemical components (i.e., heavy metals, carbon components, and water-soluble ions) from meteorological data and air pollutant data (PM2.5, SO₂, NO₂, CO, and O₃). In the integrated CNN-LSTM model, the CNN uses convolutional and pooling layers to extract features from the data, whereas the powerful nonlinear mapping and learning capabilities of the LSTM enable the time series prediction of air pollution. The experimental results showed that the CNN-LSTM generalized well in the prediction of As, Cd, Cr, Cu, Ni, and Zn, with a mean R² above 0.9. The mean R² for the predictions of PM2.5, Pb, Ti, EC, OC, SO₄²⁻, and NO₃⁻ ranged from 0.85 to 0.9. Shapley value analysis showed that PM2.5, NO₂, SO₂, and CO had the greatest influence on the model's heavy metal predictions. For water-soluble ions, the predictions were dominated by PM2.5, CO, and humidity. The prediction of the carbon fraction was affected mainly by the PM2.5 concentration. Additionally, several input variables for various components were eliminated without affecting the prediction accuracy of the model, with R² between 0.70 and 0.84, thereby maximizing modeling efficiency and lowering operational costs. Predictions from the fully trained model showed that most PM2.5 components were lower from January to March 2020 than in 2018 and 2019.
This study provides insight into improving the accuracy of modeling-based detection methods and promotes the development of more sustainable integrated air pollution monitoring.
• CNN coupled with LSTM for atmospheric time series predictions.
• Many elements had better simulation effects when atmospheric pollutants were used as inputs.
• CNN-LSTM has superior prediction performance with more than 90% accuracy.
• The Shapley value method identifies the key indicators for predicting pollutants.
• Predicted pollutant contents were lower during the COVID-19 outbreak than in 2019.
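
The Shapley value attribution mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how the values were computed, and `toy_model`, the three input features, the sample values, and the baseline below are all invented for illustration. The sketch computes exact Shapley values by averaging each feature's marginal contribution over every ordering of the features, which is only practical for a handful of inputs.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to baseline.

    A feature that is "absent" from a coalition keeps its baseline value.
    The cost grows factorially with the number of features, so this is
    suitable for toy examples only.
    """
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)           # start with every feature absent
        previous = predict(current)
        for i in order:
            current[i] = x[i]              # add feature i to the coalition
            value = predict(current)
            totals[i] += value - previous  # marginal contribution of feature i
            previous = value
    return [t / len(orderings) for t in totals]


# Hypothetical stand-in for a trained model (assumption, not from the paper).
# Inputs are ordered as (PM2.5, NO2, humidity).
def toy_model(features):
    pm25, no2, humidity = features
    return 0.5 * pm25 + 0.2 * no2 + 0.01 * pm25 * humidity


x = [80.0, 40.0, 60.0]         # made-up sample to explain
baseline = [50.0, 30.0, 50.0]  # made-up reference input
phi = shapley_values(toy_model, x, baseline)

for name, value in zip(["PM2.5", "NO2", "humidity"], phi):
    print(f"{name}: {value:.2f}")
```

The attributions sum to the difference between the prediction for `x` and the prediction for `baseline`, so ranking inputs by the magnitude of their values gives one way to judge which inputs matter most for a given prediction.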
ISSN: 0045-6535, 1879-1298
DOI: 10.1016/j.chemosphere.2023.140153