Inferring End‐Members From Geoscience Data Using Simplex Projected Gradient Descent‐Archetypal Analysis

Bibliographic Details
Published in: Journal of Geophysical Research: Machine Learning and Computation, Vol. 2, no. 2
Main Authors: Wang, Zanchenling; Wen, Tao
Format: Journal Article
Language: English
Published: Wiley, 01.06.2025
Summary: End‐member mixing analysis (EMMA) is widely used to analyze geoscience data for their end‐members and mixing proportions. Many traditional EMMA methods depend on known end‐members, which are sometimes uncertain or unknown. Unsupervised EMMA methods infer end‐members from data, but many existing ones do not strictly follow necessary constraints and lack full mathematical interpretability. Here, we introduce a novel unsupervised machine learning method, simplex projected gradient descent‐archetypal analysis (SPGD‐AA), which uses the ML model archetypal analysis to infer end‐members intuitively and interpretably without prior knowledge. SPGD‐AA uses extreme corners in data as end‐members or “archetypes,” and represents data as mixtures of end‐members. This method is most suitable for linear (conservative) mixing problems when samples with similar characteristics to end‐members are present in data. Validation on synthetic and real data sets, including river chemistry, deep‐sea sediment elemental composition, and hyperspectral imaging, shows that SPGD‐AA effectively recovers end‐members consistent with domain expertise and outperforms conventional approaches. SPGD‐AA is applicable to a wide range of geoscience data sets and beyond.

Plain Language Summary: Earth's materials (e.g., rock, soil, and water) are often mixtures of different sources. We developed a method, simplex projected gradient descent‐archetypal analysis, which allows computers to automatically identify these sources from mixture data by identifying extreme values. We tested our method on artificial data and real‐world data sets of river solutes, deep‐sea sediments, and airborne images. Our method is easy to use and can be applied to various geoscience data sets and beyond.

Key Points:
We introduce simplex projected gradient descent‐archetypal analysis (SPGD‐AA), an unsupervised machine learning model for inferring end‐members from mixed geoscience data.
SPGD‐AA is intuitive, interpretable, rigorous, broadly applicable, and outperforms conventional methods on diverse data sets.
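
The summary describes end‐members as extreme corners of the data and samples as mixtures of those corners. As a rough illustration only, the sketch below fits a generic archetypal analysis model by alternating projected gradient steps, projecting each row of the two weight matrices back onto the probability simplex after every step. It is not the authors' SPGD‐AA implementation: the function names, random initialization, step-size rule (inverse of a Lipschitz bound), and the synthetic three‐end‐member data are all assumptions made for this example.

```python
import numpy as np


def project_rows_to_simplex(V):
    """Euclidean projection of each row of V onto {w : w >= 0, sum(w) = 1}.

    Sort-based projection; hypothetical helper for this sketch.
    """
    n, k = V.shape
    U = np.sort(V, axis=1)[:, ::-1]            # each row sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0           # cumulative sums minus the target total
    idx = np.arange(1, k + 1)
    rho = (U - css / idx > 0).sum(axis=1)      # number of entries that stay positive
    theta = css[np.arange(n), rho - 1] / rho   # per-row shift
    return np.maximum(V - theta[:, None], 0.0)


def archetypal_analysis_pgd(X, k, n_iter=500, seed=0):
    """Minimize ||X - A B X||_F^2 with the rows of A and B on the simplex.

    X: (n_samples, n_features). Returns the mixing proportions A (n, k),
    the archetypes Z = B X (k, n_features), and the loss per iteration.
    Illustrative only; all settings here are assumptions.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = project_rows_to_simplex(rng.random((n, k)))
    B = project_rows_to_simplex(rng.random((k, n)))
    x_norm_sq = np.linalg.norm(X, 2) ** 2      # spectral norm of X X^T
    losses = []
    for _ in range(n_iter):
        Z = B @ X                              # current archetypes (convex mixes of samples)
        # Gradient step on A with Z fixed, step = 1 / Lipschitz bound.
        step_a = 1.0 / (2.0 * np.linalg.norm(Z @ Z.T, 2) + 1e-12)
        A = project_rows_to_simplex(A - step_a * 2.0 * (A @ Z - X) @ Z.T)
        # Gradient step on B with the updated A fixed.
        residual = A @ Z - X
        step_b = 1.0 / (2.0 * np.linalg.norm(A.T @ A, 2) * x_norm_sq + 1e-12)
        B = project_rows_to_simplex(B - step_b * 2.0 * A.T @ residual @ X.T)
        losses.append(float(np.sum((A @ B @ X - X) ** 2)))
    return A, B @ X, losses


# Synthetic linear-mixing demo: three made-up end-members in two dimensions.
rng = np.random.default_rng(1)
true_end_members = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
true_proportions = rng.dirichlet(np.ones(3) * 0.5, size=200)
X_demo = true_proportions @ true_end_members
A_demo, Z_demo, loss_demo = archetypal_analysis_pgd(X_demo, k=3, n_iter=300)
print("estimated archetypes:\n", Z_demo)
```

Because each archetype is constrained to be a convex combination of samples, the estimates can only reach the true end‐members when samples close to them are present in the data, which matches the suitability condition stated in the summary.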
ISSN:2993-5210
DOI:10.1029/2024JH000540