PM2.5 Concentration Prediction Based on Pollutant Pattern Recognition Using PCA-clustering Method and CS Algorithm Optimized SVR

Bibliographic Details
Published in Nature Environment and Pollution Technology Vol. 21; no. 1; pp. 393-403
Main Authors Liu, Wei; Chen, Fuji; Chen, Yihui
Format Journal Article
Language English
Published Technoscience Publications 01.03.2022
Summary: Environmental issues, particularly air pollution, are a matter of concern for people all around the world. Excessively high PM2.5 levels harm people’s physical and mental health, so more accurate PM2.5 concentration predictions are critical for government air pollution control. In this paper, we explored the relationship between pollutants (PM10, SO2, NO2, O3, CO) and meteorological factors (atmospheric pressure, relative humidity, air temperature, wind speed, wind direction, cumulative precipitation) that affect the generation and transmission of PM2.5. To better predict the concentration of PM2.5, we combined principal component analysis (PCA) and clustering methods to extract pollutant variables and patterns, which served as PM2.5 concentration predictors for several models: support vector regression (SVR), multivariate nonlinear regression (MNR), and artificial neural network (ANN). SVR achieved better prediction accuracy than the MNR and ANN models. Cuckoo search (CS), cross-validation (CV), and particle swarm optimization (PSO) algorithms were then used to further optimize the SVR parameters. To evaluate the prediction results, we used several indicators: root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the Pearson correlation coefficient (R) between predicted and measured values. The results confirmed that the best prediction accuracy was achieved by the CS-SVR model when the pollutant data were divided into three patterns.
ISSN: 2395-3454, 0972-6268
DOI: 10.46488/NEPT.2022.v21i01.047
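
The four evaluation indicators named in the summary (RMSE, MAE, MAPE, and the Pearson correlation coefficient R) follow standard definitions, and the sketch below shows one way to compute them in plain Python. It is an illustration only, not the authors' code: the function names are hypothetical, MAPE is assumed to be reported in percent, and every measured value is assumed to be nonzero.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    n = len(measured)
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / n


def mape(measured, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: no measured value is zero, since the error is divided by it.
    """
    n = len(measured)
    return 100.0 * sum(abs((m - p) / m) for m, p in zip(measured, predicted)) / n


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between measured and predicted values.

    Assumption: neither series is constant, otherwise the denominator is zero.
    """
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)
```

Lower RMSE, MAE, and MAPE and an R closer to 1 indicate closer agreement between predicted and measured PM2.5 concentrations, which is the sense in which the summary compares the CS-SVR, CV-SVR, and PSO-SVR variants.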