Construction and evaluation of a domain-specific knowledge graph for knowledge discovery

Bibliographic Details
Published in Information discovery and delivery Vol. 51; no. 4; pp. 358-370
Main Authors Nguyen, Huyen, Chen, Haihua, Chen, Jiangping, Kargozari, Kate, Ding, Junhua
Format Journal Article
Language English
Published Bingley Emerald Publishing Limited 24.11.2023
Emerald Group Publishing Limited
ISSN 2398-6247
2398-6255
DOI 10.1108/IDD-06-2022-0054

Summary:
Purpose: This study aims to evaluate a method of building a biomedical knowledge graph (KG).
Design/methodology/approach: This research first constructs a COVID-19 KG from the COVID-19 Open Research Data Set, covering information across six categories (i.e. disease, drug, gene, species, therapy and symptom). The construction used open-source tools to extract entities, relations and triples. Then, the COVID-19 KG is evaluated on three data-quality dimensions (correctness, relatedness and comprehensiveness) using a semiautomatic approach. Finally, this study assesses the application of the KG by building a question answering (Q&A) system. Five queries regarding COVID-19 genomes, symptoms, transmissions and therapeutics were submitted to the system and the results were analyzed.
Findings: With current extraction tools, the quality of the KG is moderate and difficult to improve unless more effort is made to improve the tools for entity extraction, relation extraction and related tasks. This study finds that comprehensiveness and relatedness positively correlate with the data size. Furthermore, the results indicate that Q&A systems built on the larger-scale KGs perform better than those built on the smaller ones for most queries, proving the importance of relatedness and comprehensiveness to ensure the usefulness of the KG.
Originality/value: The KG construction process and the data-quality-based and application-based evaluations discussed in this paper provide valuable references for KG researchers and practitioners building high-quality domain-specific knowledge discovery systems.
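As a rough illustration of the pipeline the summary describes (extracted triples feeding a Q&A lookup), the following minimal Python sketch stores (head, relation, tail) triples in an index and answers a simple query. The triples, the relation names (`has_symptom`, `causes`) and the helpers `build_index` and `query` are hypothetical and chosen only for illustration; they are not taken from the paper's KG, tools or Q&A system.

```python
from collections import defaultdict

# Hypothetical example triples (head, relation, tail); illustrative only,
# not extracted from the COVID-19 Open Research Data Set.
TRIPLES = [
    ("COVID-19", "has_symptom", "fever"),
    ("COVID-19", "has_symptom", "cough"),
    ("SARS-CoV-2", "causes", "COVID-19"),
]


def build_index(triples):
    """Group tail entities by (head, relation) for direct lookup."""
    index = defaultdict(set)
    for head, relation, tail in triples:
        index[(head, relation)].add(tail)
    return index


def query(index, head, relation):
    """Return the sorted tails for a (head, relation) pair, or an empty list."""
    return sorted(index.get((head, relation), set()))


if __name__ == "__main__":
    kg = build_index(TRIPLES)
    # A question such as "What are the symptoms of COVID-19?" is assumed
    # to map to this (head, relation) lookup.
    print(query(kg, "COVID-19", "has_symptom"))
```

In a sketch like this, a larger set of triples gives the lookup more (head, relation) pairs to match, which is one simple way to picture why answer quality could depend on the comprehensiveness of the graph.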