Uncertainty-aware surrogate modeling for urban air pollutant dispersion prediction
Published in | Building and environment Vol. 267; p. 112287 |
---|---|
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.01.2025 |
ISSN | 0360-1323 |
DOI | 10.1016/j.buildenv.2024.112287 |
Summary: | This study evaluates a surrogate modeling approach that provides rapid ensemble predictions of air pollutant dispersion in urban environments for varying meteorological forcing, while estimating irreducible and modeling uncertainties. The POD–GPR approach combining Proper Orthogonal Decomposition (POD) and Gaussian Process Regression (GPR) is applied to emulate the response surface of a Large-Eddy Simulation (LES) model of the Mock Urban Setting Test (MUST) field-scale experiment. We design and validate new methods for (i) selecting the POD-latent space dimension to avoid overfitting noisy structures due to atmospheric internal variability, and (ii) estimating the uncertainty in POD–GPR predictions. To train and validate the POD–GPR surrogate in an offline phase, we build a large dataset of 200 LES 3-D time-averaged concentration fields, which are subject to substantial spatial variability from near-source to background concentration and have a very large dimension of several million grid cells. The results show that POD–GPR reaches the best achievable accuracy levels, except for the highest concentration near the source, while predicting full fields at a computational cost five orders of magnitude lower than an LES. The results also show that the proposed mode selection criterion avoids perturbing the surrogate response surface, and that the uncertainty estimate explains a large part of the surrogate error and is spatially consistent with the observed internal variability. Finally, POD–GPR can be robustly trained with much smaller datasets, paving the way for application to realistic urban configurations.
• A large dataset of 200 large-eddy simulations is built to train a surrogate model.
• The surrogate reaches near-optimal mean concentration prediction accuracy.
• The prediction computational cost is reduced by five orders of magnitude.
• The surrogate can learn the internal variability of large-eddy simulation statistics.
• Uncertainty estimates are realistic and explain surrogate errors well.
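The summary describes POD–GPR only in words, so a minimal sketch of the general idea may help: compress the training fields with POD, regress each POD coefficient on the input parameters with a Gaussian process, then map the predicted coefficients back to a full field. Everything specific here is an assumption made for illustration and not the authors' implementation: the toy 1-D "concentration" fields, the two input parameters, the fixed number of modes (the paper proposes its own selection criterion, which is not reproduced), the RBF kernel with fixed hyperparameters, and the uncertainty estimate, which simply propagates independent per-mode GP variances through the POD basis.

```python
import numpy as np

def rbf_kernel(A, B, length_scale, variance):
    """Squared-exponential kernel between rows of A (n, d) and B (m, d)."""
    sq_dist = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def fit_pod(snapshots, n_modes):
    """POD via SVD of the centered snapshot matrix (n_samples, n_cells)."""
    mean_field = snapshots.mean(axis=0)
    _, _, vt = np.linalg.svd(snapshots - mean_field, full_matrices=False)
    basis = vt[:n_modes]                          # (n_modes, n_cells)
    coeffs = (snapshots - mean_field) @ basis.T   # (n_samples, n_modes)
    return mean_field, basis, coeffs

def gp_predict(x_train, y_train, x_test, length_scale=0.5, jitter=1e-4):
    """Zero-mean GP posterior mean and variance for one scalar output.

    Hyperparameters are fixed by hand (an assumption); a real application
    would typically optimize them.
    """
    variance = max(float(y_train.var()), 1e-12)
    k_tt = rbf_kernel(x_train, x_train, length_scale, variance)
    k_tt += jitter * variance * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train, length_scale, variance)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = np.maximum(variance - np.sum(v**2, axis=0), 0.0)
    return mean, var

def pod_gpr_predict(x_train, snapshots, x_test, n_modes):
    """Predict full fields and a per-cell standard deviation."""
    mean_field, basis, coeffs = fit_pod(snapshots, n_modes)
    mu = np.empty((len(x_test), n_modes))
    var = np.empty((len(x_test), n_modes))
    for k in range(n_modes):  # one independent GP per POD coefficient
        mu[:, k], var[:, k] = gp_predict(x_train, coeffs[:, k], x_test)
    field_mean = mean_field + mu @ basis
    # Assumes uncorrelated coefficient errors: Var[field] = sum_k var_k * mode_k^2
    field_std = np.sqrt(var @ basis**2)
    return field_mean, field_std

# Hypothetical toy problem: two input parameters, fields on a 200-cell 1-D grid.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)

def toy_field(params):
    a, b = params[:, :1], params[:, 1:]
    return (np.sin(2 * a) * np.exp(-((x - 0.3) ** 2) / 0.02)
            + np.cos(3 * b) * np.exp(-((x - 0.7) ** 2) / 0.05))

params_train = rng.uniform(0.0, 1.0, size=(60, 2))
params_test = rng.uniform(0.1, 0.9, size=(5, 2))
field_mean, field_std = pod_gpr_predict(
    params_train, toy_field(params_train), params_test, n_modes=2
)
truth = toy_field(params_test)
rel_err = np.linalg.norm(field_mean - truth) / np.linalg.norm(truth)
print(f"relative L2 error on held-out parameters: {rel_err:.2e}")
```

The expensive steps (the SVD and the Cholesky factorizations) depend only on the training data and could be done once offline; each new prediction then costs a few small matrix products. That is the general reason a surrogate of this kind can be far cheaper than the simulation it emulates, although the toy setup says nothing about the cost reduction reported in the summary.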