Uncertainty-aware surrogate modeling for urban air pollutant dispersion prediction

Bibliographic Details
Published in Building and environment Vol. 267; p. 112287
Main Authors Lumet, Eliott; Rochoux, Mélanie C.; Jaravel, Thomas; Lacroix, Simon
Format Journal Article
Language English
Published Elsevier Ltd 01.01.2025
Elsevier
ISSN 0360-1323
DOI 10.1016/j.buildenv.2024.112287

Summary: This study evaluates a surrogate modeling approach that provides rapid ensemble predictions of air pollutant dispersion in urban environments for varying meteorological forcing, while estimating irreducible and modeling uncertainties. The POD–GPR approach combining Proper Orthogonal Decomposition (POD) and Gaussian Process Regression (GPR) is applied to emulate the response surface of a Large-Eddy Simulation (LES) model of the Mock Urban Setting Test (MUST) field-scale experiment. We design and validate new methods for (i) selecting the POD-latent space dimension to avoid overfitting noisy structures due to atmospheric internal variability, and (ii) estimating the uncertainty in POD–GPR predictions. To train and validate the POD–GPR surrogate in an offline phase, we build a large dataset of 200 LES 3-D time-averaged concentration fields, which are subject to substantial spatial variability from near-source to background concentration and have a very large dimension of several million grid cells. The results show that POD–GPR reaches the best achievable accuracy levels, except for the highest concentration near the source, while predicting full fields at a computational cost five orders of magnitude lower than an LES. The results also show that the proposed mode selection criterion avoids perturbing the surrogate response surface, and that the uncertainty estimate explains a large part of the surrogate error and is spatially consistent with the observed internal variability. Finally, POD–GPR can be robustly trained with much smaller datasets, paving the way for application to realistic urban configurations.
• A large dataset of 200 large-eddy simulations is built to train a surrogate model.
• The surrogate reaches near-optimal mean concentration prediction accuracy.
• The prediction computational cost is reduced by five orders of magnitude.
• The surrogate can learn the internal variability of large-eddy simulation statistics.
• Uncertainty estimates are realistic and explain surrogate errors well.
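
The POD–GPR idea named in the summary can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a toy 1-D plume-like field in place of LES data, two normalized input parameters, an RBF kernel with fixed hyperparameters shared by all modes, and a fixed number of modes. All names (`fit_pod`, `gpr_predict`, `toy_field`) and all numerical values are hypothetical. The paper's mode selection criterion and uncertainty estimator are not reproduced here; the standard deviation below is only the plain GPR posterior spread propagated through the POD modes, assuming independent coefficients.

```python
import numpy as np

def fit_pod(snapshots, n_modes):
    """POD via thin SVD of the mean-centered snapshot matrix (n_samples, n_cells)."""
    mean_field = snapshots.mean(axis=0)
    centered = snapshots - mean_field
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:n_modes]              # orthonormal spatial modes, (n_modes, n_cells)
    coeffs = centered @ modes.T       # reduced coordinates, (n_samples, n_modes)
    return mean_field, modes, coeffs

def rbf_kernel(a, b, length_scale):
    """Unit-variance squared-exponential kernel between two point sets."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gpr_predict(x_train, y_train, x_test, length_scale=0.3, noise=1e-4):
    """Zero-mean GPR posterior mean and variance (hyperparameters are assumed, not fitted)."""
    k_train = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    k_star = rbf_kernel(x_test, x_train, length_scale)
    mean = k_star @ alpha
    v = np.linalg.solve(chol, k_star.T)
    var = np.maximum(1.0 - (v**2).sum(axis=0), 0.0)   # one value per test point
    return mean, var

def toy_field(params, grid):
    """Hypothetical stand-in for a simulation: a Gaussian bump whose position
    and amplitude depend on the two input parameters."""
    center = 0.4 + 0.2 * params[:, :1]
    amplitude = 1.0 + params[:, 1:2]
    return amplitude * np.exp(-0.5 * ((grid[None, :] - center) / 0.15) ** 2)

# Offline phase: build a small training set on a 7 x 7 parameter grid (assumed design).
grid = np.linspace(0.0, 1.0, 400)
axis = np.linspace(0.0, 1.0, 7)
x_train = np.array([[p, q] for p in axis for q in axis])
x_test = np.array([[0.5, 0.5], [0.25, 0.7], [0.8, 0.3]])

mean_field, modes, coeffs = fit_pod(toy_field(x_train, grid), n_modes=5)
scale = coeffs.std(axis=0)            # normalize each coefficient to unit variance
coef_mean, coef_var = gpr_predict(x_train, coeffs / scale, x_test)

# Online phase: map predicted coefficients back to full fields.
field_mean = mean_field + (coef_mean * scale) @ modes
field_std = np.sqrt((coef_var[:, None] * scale**2) @ modes**2)

truth = toy_field(x_test, grid)
rel_error = np.linalg.norm(field_mean - truth) / np.linalg.norm(truth)
```

The regression acts only on a handful of POD coefficients rather than on every grid cell, which is what makes full-field prediction cheap once the modes are computed. In practice the kernel hyperparameters would be fitted per mode rather than fixed as they are here.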