Explainable Machine Learning for Mapping Minerals From CRISM Hyperspectral Data

Bibliographic Details
Published in Journal of Geophysical Research: Machine Learning and Computation Vol. 2; no. 2
Main Authors Dhoundiyal, Sandeepan; Dey, Moni Shankar; Singh, Shashikant; Arun, Pattathal V.; Thangjam, Guneshwar; Porwal, Alok
Format Journal Article
Language English
Published Wiley 01.06.2025
Summary: This paper addresses some of the challenges in automating mineral mapping from CRISM hyperspectral data using two ML algorithms: Random Forest (RF) and Gradient Boosted Trees (GBTree). An interpretable framework using tree‐ensemble classification and SHapley Additive exPlanations (SHAP) is implemented to interpret the model decisions. SHAP explanations quantify the influence of diagnostic absorption features, demonstrating that classifiers rely on physically significant spectral features rather than artifacts. Three novel metrics (“Physically Significant Precision,” “Physically Significant Recall,” and “Physically Significant F‐measure”) quantify the classifier's expected performance on unseen data. RF outperforms GBTree and is therefore used to develop a novel framework for mineral mapping from CRISM data, demonstrated on five CRISM datacubes. This combination of Random Forest and SHAP addresses limitations of existing CRISM classification methods, offering stability during training, reduced manual intervention, and interpretability, while achieving a Kappa (κ) of 0.91 over the CRISM Machine Learning Toolkit's mineral data set with ∼470,000 labeled spectra.

Plain Language Summary
This paper describes a method for automating mineral detection from CRISM hyperspectral data, comparing the Random Forest and Gradient Boosted Trees algorithms. It presents a method using tree‐ensemble algorithms for classification and SHapley Additive exPlanations (SHAP) for interpretability. SHAP explanations show that classifiers prioritize physically significant spectral features over artifacts. Three novel metrics (“Physically Significant Precision,” “Physically Significant Recall,” and “Physically Significant F‐measure”) gauge classifiers' expected performance on unseen data. RF outperforms GBTree and is therefore used to develop a new framework for mineral mapping from CRISM data, which is demonstrated on five CRISM datacubes. The combination of Random Forest and SHAP addresses limitations of existing CRISM classification methods, providing stability, reduced manual intervention, and interpretability. It achieves a Kappa (κ) of 0.91 over ∼470,000 labeled spectra from the CRISM Machine Learning Toolkit's mineral data set.

Key Points
The effectiveness of the Random Forest and gradient‐boosted tree classifiers for identifying minerals from hyperspectral data is evaluated.
A novel metric, based on SHapley Additive exPlanations, to estimate a trained classifier's expected performance on unseen data is introduced.
A novel explainable framework for mapping minerals from CRISM hyperspectral data is introduced.
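
For readers unfamiliar with the Kappa (κ) statistic reported in the summary, the following is a minimal sketch of how Cohen's kappa is conventionally computed from true and predicted class labels: κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. This is not the authors' evaluation code. The function name `cohen_kappa` and the toy mineral labels are hypothetical, chosen only for illustration, and the handling of the single-class case is a convention assumed for this sketch.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: label agreement corrected for chance agreement."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement: fraction of samples where both labelings match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, multiply the fraction of true labels
    # by the fraction of predicted labels, then sum over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Both labelings contain one identical class; kappa is undefined.
        # Returning 1.0 is a convention assumed for this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy labels (not taken from the paper's data set).
y_true = ["olivine", "olivine", "pyroxene", "pyroxene", "olivine", "pyroxene"]
y_pred = ["olivine", "olivine", "pyroxene", "olivine", "olivine", "pyroxene"]
print(round(cohen_kappa(y_true, y_pred), 3))  # prints 0.667
```

In the toy example five of six labels agree (p_o = 5/6) while chance agreement is 0.5, giving κ ≈ 0.667. A value of 0.91, as reported in the summary, indicates agreement far above what chance alone would produce.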
ISSN:2993-5210
DOI:10.1029/2024JH000391