CEDG-GeoQA: Knowledge base question answering for the geoscience domain via Chinese entity description graph

Bibliographic Details
Published in	Earth Science Informatics Vol. 17; no. 3; pp. 2609-2621
Main Authors	Wei, Lai; Lu, Qinghua; Duan, Yilin; Yao, Hong; Kang, Xiaojun
Format	Journal Article
Language	English
Published	Berlin/Heidelberg: Springer Berlin Heidelberg, 01.06.2024; Springer Nature B.V.
Subjects	Earth and Environmental Science; Earth science; Earth science research; Earth Sciences; Earth System Sciences; Geography; Graph theory; Information Systems Applications (incl. Internet); Knowledge bases (artificial intelligence); Natural language; Ontology; Queries; Questions; Scientific research; Search engines; Simulation and Modeling; Space Sciences (including Extraterrestrial Physics, Space Exploration and Astronautics); Semantic parsing; Geoscience domain; Geoscience question answering; Knowledge graph
ISSN	1865-0473 1865-0481
DOI	10.1007/s12145-024-01304-8

Summary:	Acquiring geoscience knowledge is crucial for advancing earth science research. Currently, geoscience knowledge can be obtained through search engines or specialized databases. However, the quality of search engine results varies, and geoscience databases do not support natural language queries. To address these challenges, Geoscience Question Answering (GeoQA) systems have been developed to provide answers to natural language queries. Most existing research in geoscience QA focuses on geography, leaving other domains relatively unexplored. To bridge this gap, our study introduces a Chinese geoscience QA dataset that covers a wide range of topics, including geography, climate, and culture. Additionally, we propose the CEDG-GeoQA framework for Chinese geoscience QA. The framework first uses syntactic parsing to convert unstructured queries into an entity description graph (EDG). It then aligns the EDG with a comprehensive geoscience knowledge base and extracts a subgraph centered on the subject entity. This subgraph is used to score candidate answers and select the most likely one. Experiments on a Chinese geo-knowledge base show that our method outperforms existing baselines, including WDAqua, gAnswer, and NSQA, with a 5% improvement in the F1 measure.