Building an Intelligent Data Exploring Assistant for Geoscientists
Published in | Journal of geophysical research. Machine learning and computation Vol. 2; no. 3 |
---|---|
Main Authors | Widlansky, Matthew J.; Komar, Nemanja |
Format | Journal Article |
Language | English |
Published | 01.09.2025 |
Abstract | Advances in natural‐language processing and large language models (LLMs) are transforming how geoscientists interact with complex data sets, enabling efficient and intuitive scientific analyses. This study introduces the Intelligent Data Exploring Assistant (IDEA), a prototype software framework that integrates existing LLM technology with domain‐specific instructions, data, analytical tools, and computing resources to support geoscientific research. We demonstrate its application through the Station Explorer Assistant (SEA), a web‐based tool designed for sea level scientists. SEA empowers users to analyze and interpret coastal water level data by addressing challenges such as vertical datum conversions and assessing flooding risks. We also demonstrate the generalizability of building an IDEA, whereby we deploy a local instance of the framework to analyze atmospheric observations from Mars collected by NASA's InSight Mission. By combining LLM capabilities with robust domain‐specific customizations, SEA and the Mars IDEA generate accurate analyses, visualizations, and insights through natural‐language prompts. This study highlights the potential of IDEA frameworks to lower technical barriers, enhance educational opportunities, and transform geoscientific workflows while addressing the limitations and uncertainties of current LLM technology.
Artificial intelligence (AI) is transforming how scientists explore and understand our world. At the University of Hawaiʻi Sea Level Center (UHSLC), we are developing tools that use large language models, like what ChatGPT uses, to help scientists study sea level changes. One such tool, called the Station Explorer Assistant (SEA), allows researchers to ask questions in everyday language and receive clear explanations and data analyses in response. SEA uses AI to analyze sea level data, compare water levels to normal conditions, and predict potential flooding, drawing on the UHSLC's extensive database. It even writes and runs its own analysis software, which it shows the user to check that its results are accurate. By making sea level science easier to understand and access, SEA can support communities adapting to rising seas and other coastal challenges. SEA technology is generalizable across geoscience domains through a framework we call an Intelligent Data Exploring Assistant (IDEA), which we demonstrate by asking it to analyze wind observations from Mars. Our work highlights how AI can enhance scientific research and communication, and we envision similar tools being created to support scientists in many fields.
Large language models can assist geoscientists by generating data analyses and visualizations from natural‐language prompts. A general‐purpose Intelligent Data Exploring Assistant shows the potential of artificial intelligence to enhance geoscience research. The Station Explorer Assistant analyzes water level data from tide gauges, providing insights into sea level variability and risks. |
Author | Widlansky, Matthew J.; Komar, Nemanja |
Author details | 1. Widlansky, Matthew J. (ORCID: 0000-0002-3765-7327): School of Ocean and Earth Science and Technology (SOEST), Cooperative Institute for Marine and Atmospheric Research, University of Hawaiʻi at Mānoa, Honolulu, HI, USA; Department of Oceanography, SOEST, University of Hawaiʻi at Mānoa, Honolulu, HI, USA. 2. Komar, Nemanja: School of Ocean and Earth Science and Technology (SOEST), Cooperative Institute for Marine and Atmospheric Research, University of Hawaiʻi at Mānoa, Honolulu, HI, USA |
DOI | 10.1029/2025JH000649 |
EISSN | 2993-5210 |
ISSN | 2993-5210 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 3 |
ORCID | 0000-0002-3765-7327 |
OpenAccessLink | https://onlinelibrary.wiley.com/doi/pdfdirect/10.1029/2025JH000649 |
Volume | 2 |