Leveraging Machine Learning Approaches to Predict Organic Carbon Abundance in Mars‐Analog Hypersaline Lake Sediments
Published in | Journal of geophysical research. Machine learning and computation Vol. 1; no. 2 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published | 01.06.2024 |
Subjects | |
Summary: | Modern advancements in laboratory and instrumental techniques in astrobiology have improved our life detection capabilities both on Earth and beyond. These advancements have also increased the complexity of data, often resulting in data sets characterized by complex and non‐linear relationships. Machine learning methods are underutilized in astrobiology; however, these methods are extremely effective at revealing structure and patterns in complex data sets when paired with the right algorithms. Here, we employ a series of classification and regression algorithms to predict the abundance of organic carbon (OC) from X‐ray fluorescence (XRF) heavy element (>Mg) data in dynamic Mars‐analog hypersaline lake sediments. More specifically, we constructed models using the random forest, k‐nearest neighbors (KNN), support vector machine, and logistic regression algorithms. Overall, our trained models performed well in predicting the abundance of OC, with accuracies from 80% to 94%. Machine learning approaches such as classification and regression algorithms offer agnostic insight into complex data, ultimately creating a more efficient search for OC. We applied our trained model to XRF data from Martian soil, using rover‐based (PIXL) and orbital (Odyssey) data sets, to produce probability predictions of OC abundance. Our predictions show a high probability that OC abundance is low, which is comparable to OC data from recently landed missions. These results highlight the potential for predictive machine learning models to be trained on data from analog environments on Earth and then applied to extraterrestrial targets, ultimately improving life detection efforts.
Plain Language Summary
Modern data sets have become large as a by‐product of the desire to discover unknown relationships or characterize complex non‐linear ones. Machine learning approaches are extremely valuable for tackling such problems; however, those methods are underutilized in astrobiology and therefore have not been refined for these types of data. Here, we employ machine learning approaches to predict organic carbon (OC) abundance from XRF‐derived elemental abundances of sediment cores. Sample sediment cores were acquired from three Mars‐analog hypersaline lakes (Canada) and, for comparison, one freshwater lake (Greenland). Overall, our models successfully predicted OC concentration, with average accuracies between 80% and 94% and root mean square errors within 1.0 wt% OC. Furthermore, we applied our model to data from Martian instruments, including rover‐based and orbital data sets. We computed probability predictions that corroborate OC measurements made on the Martian surface. Our study demonstrates the potential for machine learning methods to aid in life detection efforts.
Key Points
Predictive machine learning models have the potential to aid in life detection efforts beyond Earth using organic geochemical data sets
We predicted organic carbon (OC) abundance using only XRF‐derived elemental abundances with greater than 80% accuracy
Post‐hoc interpretation of our models highlights the importance of elements associated with clays in determining OC concentrations |
---|---|
ISSN: | 2993-5210 |
DOI: | 10.1029/2024JH000138 |
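
The summary above describes classifying OC abundance from XRF elemental abundances with algorithms that include k‐nearest neighbors (KNN). The following is a minimal sketch of that general idea only, not the authors' pipeline: the data are synthetic, the three feature columns and their class centres are invented for illustration, and the names `knn_predict`, `accuracy`, and `make_synthetic` are hypothetical helpers written with the Python standard library.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Return the majority label among the k training rows nearest to x."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], x),  # Euclidean distance
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=5):
    """Fraction of held-out rows whose predicted class matches the true class."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y
        for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def make_synthetic(n_per_class, rng):
    """Synthetic stand-in for XRF features; NOT the study's data.

    Each row holds three made-up elemental abundances, and the class
    centres below are assumptions chosen only to separate the classes.
    """
    X, y = [], []
    for label, centre in (("low", (2.0, 20.0, 3.0)), ("high", (6.0, 25.0, 5.0))):
        for _ in range(n_per_class):
            X.append(tuple(rng.gauss(mu, 1.0) for mu in centre))
            y.append(label)
    return X, y


rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_synthetic(50, rng)

# Shuffle, then hold out 20% of the rows for evaluation.
rows = list(zip(X, y))
rng.shuffle(rows)
split = int(0.8 * len(rows))
train_X, train_y = zip(*rows[:split])
test_X, test_y = zip(*rows[split:])

score = accuracy(train_X, train_y, test_X, test_y)
print(f"held-out accuracy: {score:.2f}")
```

In practice the features would usually be standardized before computing distances, since elements measured on different scales would otherwise dominate the neighbor search; that step is omitted here to keep the sketch short.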