A random forest partition model for predicting NO₂ concentrations from traffic flow and meteorological conditions

Bibliographic Details
Published in The Science of the total environment Vol. 651; no. Pt 1; p. 475
Main Author Kamińska, Joanna A
Format Journal Article
Language English
Published Netherlands 15.02.2019
Summary: High concentrations of nitrogen dioxide in the air, particularly in heavily urbanised areas, have an adverse effect on many aspects of residents' health (short-term and long-term damage, unpleasant odour and other effects). A method is proposed for modelling atmospheric NO₂ concentrations in a conurbation, using a partition model M consisting of two separate models: M_l for lower concentration values and M_u for upper values. An advanced data mining technique, that of random forests, is used. This is a method based on machine learning, involving the simultaneous compilation of information from multiple random trees. Using the example of data recorded in Wrocław (Poland) in 2015-2017, an iterative method was applied to determine the boundary concentration ỹ for which the mean absolute deviation error for the partition model attained its lowest value. The resulting model had an R² value of 0.82, compared with 0.60 for a classical random forest model. The importances of the variables in the model M_l, as in the classical case, indicate that the greatest influence on NO₂ concentrations comes from traffic flow, followed by meteorological factors, in particular the wind direction and speed. In the model M_u the importances of the variables are significantly different: while traffic flow still has the greatest impact, the effects of temperature and relative humidity are almost as great. This confirms the justifiability of constructing separate models for low and high pollution concentrations.
ISSN:1879-1026
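
The partition idea described in the summary can be illustrated with a minimal Python sketch. It is not the author's implementation. It assumes scikit-learn's RandomForestRegressor, purely synthetic data (the Wrocław measurements are not reproduced here), plain mean absolute error as a stand-in for the "mean absolute deviation error", and a hypothetical routing rule in which a forest trained on all data gives a first estimate that decides whether the lower or the upper model makes the final prediction. The summary does not say how observations are assigned to the two models, so that rule and all function names are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def make_synthetic_data(n, seed=0):
    """Synthetic stand-in data (assumption): traffic, wind speed, temperature."""
    rng = np.random.default_rng(seed)
    traffic = rng.uniform(0, 5000, n)
    wind = rng.uniform(0, 10, n)
    temp = rng.uniform(-10, 30, n)
    X = np.column_stack([traffic, wind, temp])
    y = 20 + 0.01 * traffic - 2.0 * wind + 0.3 * temp + rng.normal(0, 3, n)
    return X, np.clip(y, 0, None)


def new_forest(seed):
    return RandomForestRegressor(n_estimators=30, random_state=seed)


def fit_partition_model(router, X, y, boundary, seed=0):
    """Fit one forest on targets <= boundary and one on targets above it."""
    low = y <= boundary
    m_l = new_forest(seed).fit(X[low], y[low])
    m_u = new_forest(seed).fit(X[~low], y[~low])
    return router, m_l, m_u


def predict_partition_model(models, X, boundary):
    """Route each row by the router's first estimate (hypothetical rule)."""
    router, m_l, m_u = models
    use_low = router.predict(X) <= boundary
    pred = np.empty(len(X))
    if use_low.any():
        pred[use_low] = m_l.predict(X[use_low])
    if (~use_low).any():
        pred[~use_low] = m_u.predict(X[~use_low])
    return pred


def search_boundary(X_train, y_train, X_val, y_val, candidates, seed=0):
    """Try each candidate boundary and keep the one with the lowest
    validation mean absolute error. Returns (boundary, error, models),
    or None if no candidate leaves enough data on both sides."""
    router = new_forest(seed).fit(X_train, y_train)
    best = None
    for boundary in candidates:
        low = y_train <= boundary
        if low.sum() < 10 or (~low).sum() < 10:
            continue  # too few samples to train one of the two forests
        models = fit_partition_model(router, X_train, y_train, boundary, seed)
        pred = predict_partition_model(models, X_val, boundary)
        error = float(np.mean(np.abs(y_val - pred)))
        if best is None or error < best[1]:
            best = (boundary, error, models)
    return best


if __name__ == "__main__":
    X, y = make_synthetic_data(600)
    X_tr, y_tr, X_va, y_va = X[:400], y[:400], X[400:], y[400:]
    candidates = np.quantile(y_tr, [0.5, 0.6, 0.7, 0.8])
    boundary, error, models = search_boundary(X_tr, y_tr, X_va, y_va, candidates)
    print("boundary:", boundary, "validation MAE:", error)
    # Per-model variable importances, comparable in spirit to the summary.
    print("M_l importances:", models[1].feature_importances_)
    print("M_u importances:", models[2].feature_importances_)
```

Comparing `feature_importances_` of the two fitted forests is one way to check whether the variables driving low and high concentrations differ, which is the comparison the summary reports for M_l and M_u. On synthetic data the numbers carry no meaning beyond showing the mechanics.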