Leveraging Machine Learning Approaches to Predict Organic Carbon Abundance in Mars‐Analog Hypersaline Lake Sediments

Bibliographic Details
Published in Journal of Geophysical Research: Machine Learning and Computation, Vol. 1, no. 2
Main Authors Nichols, Floyd; Pontefract, Alexandra; Masterson, Andrew L.; Thompson, Mia L.; Carr, Christopher E.; Tuccillo, Mia T.; Osburn, Magdalena R.
Format Journal Article
Language English
Published 01.06.2024
Summary: Modern advancements in laboratory and instrumental techniques in astrobiology have improved our life detection capabilities both on Earth and beyond. These advancements have also increased the complexity of data, often resulting in data sets that are characterized by complex and non‐linear relationships. Machine learning methods are underutilized in astrobiology; however, these methods are extremely effective at revealing structure and patterns in complex data sets when paired with the right algorithms. Here, we employ a series of classification and regression algorithms to predict the abundance of organic carbon (OC) from X‐ray fluorescence (XRF) heavy element (>Mg) data in dynamic Mars‐analog hypersaline lake sediments. More specifically, we constructed models using the random forest, k‐nearest neighbors (KNN), support vector machine, and logistic regression algorithms. Overall, our trained models performed well in predicting the abundance of OC, with accuracies from 80% to 94%. Machine learning approaches such as classification and regression algorithms offer agnostic insight into complex data, ultimately creating a more efficient search for OC. We applied our trained model to XRF data from Martian soil using rover‐based (PIXL) and orbital (Odyssey) data sets to produce probability predictions of OC abundance. Our predictions show a high probability that OC abundance is low, which is comparable to OC data from recently landed missions. These results highlight the potential for predictive machine learning models to be trained on data from analog environments on Earth and then applied to extraterrestrial targets, ultimately improving life detection efforts.

Plain Language Summary: Modern data sets have become large as a by‐product of the desire to discover unknown relationships or to characterize complex non‐linear ones. Machine learning approaches are extremely valuable for tackling such problems; however, those methods are underutilized in astrobiology and therefore have not been refined for these types of data. Here, we employ machine learning approaches to predict organic carbon (OC) abundance from XRF‐derived elemental abundances of sediment cores. Sample sediment cores were acquired from three Mars‐analog hypersaline lakes (Canada) and, for comparison, one freshwater lake (Greenland). Overall, our models successfully predicted OC concentration, with average accuracies between 80% and 94% and root mean square errors within 1.0 wt% OC. Furthermore, we applied our model to data from Martian instruments, including rover‐based and orbital data sets. We compute probability predictions that corroborate OC abundances measured on the Martian surface. Our study demonstrates the potential for machine learning methods to aid in life detection efforts.

Key Points:
Predictive machine learning models have the potential to aid in life detection efforts beyond Earth using organic geochemical data sets.
We predicted organic carbon (OC) abundance using only XRF‐derived elemental abundances with greater than 80% accuracy.
Post‐hoc interpretation of our models highlights the importance of elements associated with clays in determining OC concentrations.
ISSN:2993-5210
DOI:10.1029/2024JH000138
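
Illustrative sketch: The summary above describes classifying OC abundance from XRF elemental data with algorithms including k‐nearest neighbors. This record does not include the authors' code or data, so the following is only a minimal toy version of the KNN idea in plain Python. Everything in it is an assumption made for illustration: the synthetic samples, the three unnamed "element" features, the class labels `low_OC` and `high_OC`, the value of `k`, and the 75/25 holdout split.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of held-out samples whose predicted class matches the label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y
        for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def make_synthetic(n_per_class, seed=0):
    """Hypothetical stand-in data: three made-up 'element' features per sample.

    Class centers are invented and well separated; they are not
    measurements from the study.
    """
    rng = random.Random(seed)
    centers = {"low_OC": (1.0, 5.0, 2.0), "high_OC": (4.0, 2.0, 6.0)}
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            X.append(tuple(rng.gauss(c, 0.5) for c in center))
            y.append(label)
    return X, y


# Build the toy data set, shuffle it, and hold out 25% for evaluation.
X, y = make_synthetic(40)
rows = list(zip(X, y))
random.Random(1).shuffle(rows)
split = int(0.75 * len(rows))
train_X, train_y = zip(*rows[:split])
test_X, test_y = zip(*rows[split:])

acc = accuracy(train_X, train_y, test_X, test_y, k=3)
print(f"holdout accuracy: {acc:.2f}")
```

Because the toy classes are deliberately well separated, the accuracy printed here says nothing about the 80% to 94% reported in the summary; the sketch only shows the shape of a train, predict, and score loop for one of the named algorithms.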