From Labquakes to Megathrusts: Scaling Deep Learning Based Pickers Over 15 Orders of Magnitude

Bibliographic Details
Published in: Journal of Geophysical Research: Machine Learning and Computation, Vol. 1, No. 4
Main Authors: Shi, Peidong, Meier, Men‐Andrin, Villiger, Linus, Tuinstra, Katinka, Selvadurai, Paul Antony, Lanza, Federica, Yuan, Sanyi, Obermann, Anne, Mesimeri, Maria, Münchmeyer, Jannes, Bianchi, Patrick, Wiemer, Stefan
Format: Journal Article
Language: English
Published: American Geophysical Union/Wiley, 01.12.2024

Summary: The application of machine learning techniques in seismology has greatly advanced seismological analysis, especially for earthquake detection and seismic phase picking. However, machine learning approaches still face challenges in generalizing to data sets that differ from their original training setting. Previous studies have focused on retraining or transfer learning for these scenarios, but both require high-quality labeled data sets. This paper demonstrates a new approach for augmenting already trained models without the need for additional training data. We propose four strategies (rescaling, model aggregation, shifting, and filtering) to enhance the performance of pre-trained models on out-of-distribution data sets. We further devise several methodologies to ensemble the individual predictions from these strategies into a final unified prediction that combines robustness with detection sensitivity. We develop an open-source Python module, quakephase, that implements these methods and can flexibly process continuous seismic input data of any sampling rate. With quakephase and pre-trained ML models from SeisBench, we perform systematic benchmark tests on data recorded by different types of instruments, ranging from acoustic emission sensors to distributed acoustic sensing, and collected at different scales, from laboratory acoustic emission events to major tectonic earthquakes. Our tests highlight that rescaling is essential for dealing with small-magnitude seismic events recorded at high sampling rates, as well as with larger-magnitude events having long codas and remote events with long wave trains. Our results demonstrate that the proposed methods are effective in augmenting pre-trained models for out-of-distribution data sets, especially in scenarios with limited labeled data for transfer learning. (Illustrative sketches of the rescaling and aggregation ideas follow the key points below.)

Plain Language Summary: Machine learning has revolutionized earthquake detection and arrival-time picking, relying on vast amounts of accurately labeled data for model development and training. However, when faced with new, unique data sets, the lack of labeled information poses a significant challenge. In this study, we introduce a method to enhance the performance of pre-trained machine learning models on such exotic data sets, even in the absence of labeled data. Our approach does not involve creating new models; instead, it focuses on enhancing and aggregating existing pre-trained models to address the problem of missing labeled data. Our comprehensive benchmark tests show that machine learning models initially trained on tectonic earthquakes can be effectively repurposed to analyze events ranging from labquakes and tiny induced earthquakes to megathrusts, recorded by various instruments at various sampling frequencies.

Key Points:
- We propose a workflow to enhance the performance of pre-trained seismic picking models on out-of-distribution data sets without retraining
- Data rescaling and prediction ensembling can strongly augment pre-trained seismic phase-picking models
- Rescaling makes seismic phase-picking models trained on local seismicity directly applicable to quakes spanning over 15 orders of magnitude
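
The rescaling strategy described in the summary can be illustrated with a short sketch using ObsPy and a pre-trained SeisBench picker. This is a minimal reading of the idea, not the authors' exact quakephase implementation: the file name "ae_event.mseed", the choice of PhaseNet weights, and the relabel-then-map-back procedure are illustrative assumptions.

    # Minimal sketch of the rescaling idea, assuming ObsPy and SeisBench are
    # installed. "ae_event.mseed" is a hypothetical file of laboratory acoustic
    # emission data recorded at a very high sampling rate (e.g., MHz range).
    import seisbench.models as sbm
    from obspy import read

    model = sbm.PhaseNet.from_pretrained("original")  # trained on ~100 Hz local data
    target_rate = 100.0                               # the model's native sampling rate

    stream = read("ae_event.mseed")
    scale = stream[0].stats.sampling_rate / target_rate
    for tr in stream:
        # Relabel (rather than resample) the sampling rate: this stretches the
        # time axis by `scale`, so the waveform "looks like" ordinary local data.
        tr.stats.sampling_rate = target_rate

    annotations = model.annotate(stream)  # P/S phase probability traces
    # Pick offsets measured on the rescaled time axis must be divided by
    # `scale` to convert them back to true seconds after the trace start time.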
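Model aggregation can likewise be sketched by combining the phase probability curves of one architecture trained on different data sets. The plain mean below is only an illustrative aggregation; the paper devises several ensembling methodologies, and this sketch does not claim to reproduce any particular one.

    # Illustrative sketch of aggregating predictions from several pre-trained
    # pickers; "original", "stead", and "instance" are real SeisBench weight
    # sets for PhaseNet. "event.mseed" is a hypothetical input waveform file.
    import numpy as np
    import seisbench.models as sbm
    from obspy import read

    stream = read("event.mseed")

    p_curves = []
    for name in ["original", "stead", "instance"]:
        model = sbm.PhaseNet.from_pretrained(name)
        ann = model.annotate(stream)                  # probability traces
        p_curves.append(ann.select(channel="*_P")[0].data)

    n = min(len(c) for c in p_curves)                 # guard against length drift
    p_mean = np.mean([c[:n] for c in p_curves], axis=0)
    # Thresholding p_mean (e.g., p_mean > 0.3) then yields ensemble P picks.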
ISSN: 2993-5210
DOI: 10.1029/2024JH000220