CEDG-GeoQA: Knowledge base question answering for the geoscience domain via Chinese entity description graph

Bibliographic Details
Published in: Earth Science Informatics, Vol. 17, no. 3, pp. 2609-2621
Main Authors: Wei, Lai; Lu, Qinghua; Duan, Yilin; Yao, Hong; Kang, Xiaojun
Format: Journal Article
Language: English
Published: Berlin/Heidelberg, Springer Berlin Heidelberg, 01.06.2024
Springer Nature B.V.
ISSN: 1865-0473, 1865-0481
DOI: 10.1007/s12145-024-01304-8

Summary: Acquiring geoscience knowledge is crucial for advancing earth science research. Currently, geoscience knowledge can be obtained through search engines or specialized databases. However, the quality of search engine results varies, and geoscience databases do not support natural language queries. To address these challenges, Geoscience Question Answering (GeoQA) systems have been developed to provide answers to natural language queries. Most existing geoscience QA research focuses on geography, leaving other domains relatively unexplored. To bridge this gap, our study introduces a Chinese geoscience QA dataset that covers a wide range of topics, including geography, climate, and culture. Additionally, we propose the CEDG-GeoQA framework for Chinese geoscience QA. The model first uses syntactic parsing to convert unstructured queries into an entity description graph (EDG). It then aligns the EDG with a comprehensive geoscience knowledge base and extracts a subgraph centered on the subject entity. This subgraph is used to assess candidate answers and determine the most likely response. Our experiments, conducted on a Chinese geo-knowledge base, show that our method outperforms existing baselines, including WDAqua, gAnswer, and NSQA, with a 5% improvement in the F1 measure.
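The last two stages described in the summary (extracting a subgraph around the subject entity, then assessing candidate answers) can be illustrated with a toy sketch. This is not the authors' implementation: the triples in `KB`, the simplified `EntityDescriptionGraph` class, and the binary scoring rule are all assumptions made for illustration. The syntactic parsing step is skipped, so the graph is built by hand.

```python
from dataclasses import dataclass

# Hypothetical toy knowledge base of (subject, relation, object) triples.
KB = [
    ("Yangtze River", "flows_into", "East China Sea"),
    ("Yangtze River", "located_in", "China"),
    ("Yellow River", "flows_into", "Bohai Sea"),
    ("East China Sea", "part_of", "Pacific Ocean"),
]


@dataclass
class EntityDescriptionGraph:
    """Simplified stand-in for an EDG: a subject entity and the relation asked about."""
    subject: str
    relation: str


def extract_subgraph(kb, subject):
    """Keep only the triples that touch the subject entity (one-hop neighbourhood)."""
    return [t for t in kb if subject in (t[0], t[2])]


def score_candidates(edg, subgraph):
    """Score each neighbouring node: 1.0 if linked by the queried relation, else 0.0."""
    scores = {}
    for s, r, o in subgraph:
        candidate = o if s == edg.subject else s
        score = 1.0 if r == edg.relation else 0.0
        scores[candidate] = max(scores.get(candidate, 0.0), score)
    return scores


def answer(edg, kb):
    """Return the best-scoring candidate, or None if nothing matches the relation."""
    scores = score_candidates(edg, extract_subgraph(kb, edg.subject))
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Hand-built graph for a question such as "Which sea does the Yangtze River flow into?"
query = EntityDescriptionGraph(subject="Yangtze River", relation="flows_into")
print(answer(query, KB))
```

A real system would replace the binary score with a learned or similarity-based ranking and would link the question's entity mention to the knowledge base rather than assume an exact string match.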