A Framework for the Cross‐Validation of Categorical Geostatistical Simulations

Bibliographic Details
Published in	Earth and space science (Hoboken, N.J.) Vol. 7; no. 8
Main Authors	Juda, Przemysław, Renard, Philippe, Straubhaar, Julien
Format	Journal Article
Language	English
Published	Hoboken: John Wiley & Sons, Inc, 01.08.2020; American Geophysical Union (AGU)
Subjects	Alluvial aquifers; categorical variable; Coastal aquifers; Datasets; Expected values; geostatistics; Methods; model testing; Parameter identification; Simulation
Summary:	The mapping of subsurface parameters and the quantification of spatial uncertainty require selecting adequate models and their parameters. Cross‐validation techniques have been widely used for geostatistical model selection for continuous variables, but the situation is different for categorical variables. In these cases, cross‐validation is seldom applied, and there is no clear consensus on which method to employ. Therefore, this paper proposes a systematic framework for the cross‐validation of geostatistical simulations of categorical variables such as geological facies. The method is based on K‐fold cross‐validation combined with a proper scoring rule. It can be applied whenever an observation data set is available. At each cross‐validation iteration, the training set becomes conditioning data for the tested geostatistical model, and the ensemble of simulations is compared to the true values of the held‐out data. The proposed framework is generic. Its application is illustrated with two examples using multiple‐point statistics simulations. In the first test case, the aim is to identify a training image from a given data set. In the second test case, the aim is to identify the parameters in a situation including nonstationarity for a coastal alluvial aquifer in the south of France. Cross‐validation scores are used as metrics of model performance, and the quadratic scoring rule, the zero‐one score, and the balanced linear score are compared. The study shows that the proposed fivefold stratified cross‐validation with the quadratic scoring rule allows ranking the geostatistical models and helps to identify the proper parameters.
Key Points:	A cross‐validation framework is developed for testing simulations of categorical variables. The methodology is generic, and competing models are compared using a single score. It requires a set of observation points and can be used with any spatial simulation method.
ISSN:	2333-5084
DOI:	10.1029/2020EA001152
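
The procedure outlined in the Summary (stratified K‐fold cross‐validation, conditioning on the training set, and scoring the ensemble with a quadratic rule) could be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, `toy_simulator` is a simple nearest‐neighbour stand‐in for a real geostatistical simulator such as multiple‐point statistics, and the quadratic score is written in its Brier‐type form (lower is better), which may differ in sign or scaling from the definition used in the paper.

```python
import numpy as np

def stratified_folds(labels, n_folds, rng):
    """Assign observations to folds so each fold keeps similar category proportions."""
    folds = np.empty(len(labels), dtype=int)
    for cat in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cat))
        folds[idx] = np.arange(len(idx)) % n_folds
    return folds

def quadratic_score(probs, truth):
    """Mean Brier-type quadratic score (assumed form; lower is better).

    probs: (n_points, n_cat) predicted category probabilities.
    truth: (n_points,) integer indices of the observed categories.
    """
    onehot = np.eye(probs.shape[1])[truth]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def toy_simulator(xy_train, cat_train, xy_test, n_real, rng, n_neighbors=3):
    """Hypothetical stand-in for a geostatistical simulator (NOT multiple-point statistics).

    Each realization copies, at every test point, the category of one of the
    n_neighbors nearest conditioning points, chosen at random.
    """
    dist = np.linalg.norm(xy_test[:, None, :] - xy_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :n_neighbors]        # (n_test, n_neighbors)
    pick = rng.integers(0, n_neighbors, size=(n_real, len(xy_test)))
    return cat_train[nearest[np.arange(len(xy_test)), pick]]   # (n_real, n_test)

def cross_validate(xy, cat, n_cat, simulator, n_folds=5, n_real=50, seed=0):
    """Average quadratic score over stratified folds for one candidate model."""
    rng = np.random.default_rng(seed)
    folds = stratified_folds(cat, n_folds, rng)
    scores = []
    for f in range(n_folds):
        test = folds == f
        # The training set becomes the conditioning data for the tested model.
        sims = simulator(xy[~test], cat[~test], xy[test], n_real, rng)
        # Category probabilities at test points, estimated as ensemble frequencies.
        probs = np.stack([(sims == c).mean(axis=0) for c in range(n_cat)], axis=1)
        scores.append(quadratic_score(probs, cat[test]))
    return float(np.mean(scores))

# Synthetic two-category data set (an assumption made purely for illustration).
data_rng = np.random.default_rng(42)
xy_obs = data_rng.random((200, 2))
cat_obs = (xy_obs[:, 0] > 0.5).astype(int)

cv_score = cross_validate(xy_obs, cat_obs, n_cat=2, simulator=toy_simulator)
print(f"mean quadratic score over 5 folds: {cv_score:.3f}")
```

To rank competing models in this sketch, one would call `cross_validate` once per candidate simulator (or parameter set) with the same seed and compare the resulting scores, keeping the candidate with the lowest value under this orientation of the score.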