GCP: Graph Encoder With Content-Planning for Sentence Generation From Knowledge Bases

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 44, No. 11, pp. 7521-7533
Main Authors: Trisedya, Bayu Distiawan; Qi, Jianzhong; Wang, Wei; Zhang, Rui
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.11.2022
Summary: A knowledge base is a large repository of facts usually represented as triples, each consisting of a subject, a predicate, and an object. The triples together form a graph, i.e., a knowledge graph. The triple representation in a knowledge graph offers a simple interface for applications to access the facts. However, this representation is not in a natural language form, making it difficult for humans to understand. We address this problem by proposing a system to translate a set of triples (i.e., a graph) into natural sentences. We take an encoder-decoder based approach. Specifically, we propose a Graph encoder with Content-Planning capability (GCP) to encode an input graph. GCP not only works as an encoder but also serves as a content planner by using an entity-order aware topological traversal to encode a graph. This way, GCP can capture the relationships between entities in a knowledge graph as well as provide information on the proper entity order for the decoder. Hence, the decoder can generate sentences with a proper entity mention ordering. Experimental results show that GCP achieves improvements over state-of-the-art models of up to 3.6%, 4.1%, and 3.8% in three common metrics, BLEU, METEOR, and TER, respectively. The code is available at https://github.com/ruizhang-ai/GCP/
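
The entity-order aware topological traversal described in the summary can be pictured with a minimal sketch: given a set of (subject, predicate, object) triples, traverse the induced graph so that subjects tend to be mentioned before their objects, producing the order in which the decoder should mention entities. The Python sketch below is an illustration only, not the authors' GCP implementation (their code is in the linked repository); the function name plan_entity_order, the preferred_order tie-breaking parameter, and the example triples are hypothetical.

# Minimal, illustrative sketch of content planning over a triple graph.
# NOT the authors' GCP code (see https://github.com/ruizhang-ai/GCP/);
# names and the tie-breaking heuristic here are assumptions made purely
# to illustrate an entity-order aware topological traversal.

from collections import defaultdict

def plan_entity_order(triples, preferred_order=None):
    """Return an entity ordering for (subject, predicate, object) triples.

    A topological traversal of the triple graph is used so that subjects
    tend to precede their objects; ties are broken by preferred_order
    (e.g., an entity order observed in training data), when given.
    """
    graph = defaultdict(set)      # subject -> set of objects
    in_degree = defaultdict(int)  # entity -> number of incoming edges
    entities = set()

    for subj, _pred, obj in triples:
        entities.update((subj, obj))
        if obj not in graph[subj]:
            graph[subj].add(obj)
            in_degree[obj] += 1

    rank = {e: i for i, e in enumerate(preferred_order or [])}
    # Start from entities with no incoming edges (typical "root" subjects).
    ready = sorted((e for e in entities if in_degree[e] == 0),
                   key=lambda e: rank.get(e, len(rank)))

    order = []
    while ready:
        entity = ready.pop(0)
        order.append(entity)
        for nxt in graph[entity]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)
        ready.sort(key=lambda e: rank.get(e, len(rank)))

    # Fall back for entities left unordered (e.g., cycles in the graph).
    order.extend(e for e in entities if e not in order)
    return order

if __name__ == "__main__":
    triples = [
        ("John_Doe", "birthPlace", "London"),
        ("London", "country", "United_Kingdom"),
        ("John_Doe", "occupation", "Engineer"),
    ]
    # Roots come first; ties among newly ready entities depend on
    # preferred_order (here unset), e.g.:
    # ['John_Doe', 'London', 'Engineer', 'United_Kingdom']
    print(plan_entity_order(triples))

In the paper's model, this ordering signal is produced while the graph is being encoded, so the encoder output both captures the relationships between entities and guides the decoder's entity mention order.
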
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2021.3118703