Combining data assimilation and machine learning to emulate a dynamical model from sparse and noisy observations: A case study with the Lorenz 96 model

Bibliographic Details
Published in	Journal of computational science Vol. 44; p. 101171
Main Authors	Brajard, Julien, Carrassi, Alberto, Bocquet, Marc, Bertino, Laurent
Format	Journal Article
Language	English
Published	Elsevier B.V., 01.07.2020
Subjects	Data assimilation; Dynamical model; Emulator; Geophysics; Machine learning; Observations; Physics
Summary:	• We address the problem of dynamics emulation from sparse and noisy observations.
• An algorithm combining data assimilation and machine learning is applied.
• The approach is tested on the chaotic 40-variable Lorenz 96 model.
• The output of the algorithm is a data-driven surrogate numerical model.
• The surrogate model is validated on both forecast skill and long-term properties.
A novel method based on the combination of data assimilation and machine learning is introduced. The new hybrid approach has a twofold purpose: (i) emulating hidden, possibly chaotic, dynamics and (ii) predicting their future states. The method consists of iteratively applying a data assimilation step, here an ensemble Kalman filter, and a neural network. Data assimilation is used to optimally combine a surrogate model with sparse, noisy data. The resulting analysis is spatially complete and is used as a training set by the neural network to update the surrogate model. The two steps are then repeated iteratively. Numerical experiments have been carried out using the chaotic 40-variable Lorenz 96 model, demonstrating both convergence and statistical skill of the proposed hybrid approach. The surrogate model shows short-term forecast skill up to two Lyapunov times, retrieves the positive Lyapunov exponents, and reproduces the more energetic frequencies of the power spectral density. The sensitivity of the method to critical setup parameters is also presented: the forecast skill decreases smoothly with increased observational noise but drops abruptly if less than half of the model domain is observed. The successful synergy between data assimilation and machine learning, demonstrated here with a low-dimensional system, encourages further investigation of such hybrids with more sophisticated dynamics.
ISSN:	1877-7503 1877-7511
DOI:	10.1016/j.jocs.2020.101171
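
The data assimilation half of the loop described in the summary can be illustrated with a minimal sketch: a 40-variable Lorenz 96 model, observations of half the domain with added noise, and one ensemble Kalman filter analysis that turns them into a spatially complete state estimate. Several choices here are assumptions not stated in this record: the forcing F = 8, the time step of 0.05, the fourth-order Runge-Kutta integrator, the ensemble size of 50, and the stochastic (perturbed-observation) form of the filter, which is used only because it is the simplest variant. All function names are hypothetical. Localization, inflation, and the neural network training step are omitted.

```python
import numpy as np

def lorenz96_tendency(x, forcing=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, with periodic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt=0.05, forcing=8.0):
    """Advance the state by one fourth-order Runge-Kutta step (assumed integrator)."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_tendency(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def observe(x, obs_idx, obs_std, rng):
    """Sparse, noisy observations: a subset of variables plus Gaussian noise."""
    return x[obs_idx] + obs_std * rng.standard_normal(len(obs_idx))

def enkf_analysis(ensemble, y, obs_idx, obs_std, rng):
    """Stochastic EnKF analysis; ensemble has shape (n_members, n_state)."""
    n_members = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    obs_anom = anomalies[:, obs_idx]
    # Sample covariances: P H^T (n_state x n_obs) and H P H^T + R (n_obs x n_obs).
    pht = anomalies.T @ obs_anom / (n_members - 1)
    s = obs_anom.T @ obs_anom / (n_members - 1) + obs_std**2 * np.eye(len(obs_idx))
    # Kalman gain K = P H^T S^{-1}; S is symmetric, so solve for K^T and transpose.
    gain = np.linalg.solve(s, pht.T).T
    # Each member assimilates its own perturbed copy of the observations.
    perturbed = y + obs_std * rng.standard_normal((n_members, len(obs_idx)))
    innovations = perturbed - ensemble[:, obs_idx]
    return ensemble + innovations @ gain.T

# Illustrative setup (all values are assumptions, not taken from the article).
rng = np.random.default_rng(0)
n_state, n_members, obs_std = 40, 50, 1.0

# Spin up a "true" trajectory so the state lies on the attractor.
truth = 8.0 + rng.standard_normal(n_state)
for _ in range(500):
    truth = rk4_step(truth)

# Observe a random half of the variables, then run one analysis step.
obs_idx = np.sort(rng.choice(n_state, size=n_state // 2, replace=False))
y = observe(truth, obs_idx, obs_std, rng)
forecast = truth + 2.0 * rng.standard_normal((n_members, n_state))
analysis = enkf_analysis(forecast, y, obs_idx, obs_std, rng)

def rmse(ens):
    """Root-mean-square error of the ensemble mean against the truth."""
    return np.sqrt(np.mean((ens.mean(axis=0) - truth) ** 2))

print("forecast RMSE:", rmse(forecast))
print("analysis RMSE:", rmse(analysis))
```

In the method the summary describes, analyses like this one would be collected over many assimilation cycles and used as the training set for the neural network surrogate, which would then replace `rk4_step` as the forecast model in the next round of assimilation.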