GCP: Graph Encoder With Content-Planning for Sentence Generation From Knowledge Bases

Bibliographic Details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 44, No. 11, pp. 7521-7533
Main Authors: Trisedya, Bayu Distiawan; Qi, Jianzhong; Wang, Wei; Zhang, Rui
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.11.2022
Summary: A knowledge base is a large repository of facts usually represented as triples, each consisting of a subject, a predicate, and an object. The triples together form a graph, i.e., a knowledge graph. The triple representation in a knowledge graph offers a simple interface for applications to access the facts. However, this representation is not in a natural language form, making it difficult for humans to understand. We address this problem by proposing a system to translate a set of triples (i.e., a graph) into natural sentences. We take an encoder-decoder based approach. Specifically, we propose a Graph encoder with Content-Planning capability (GCP) to encode an input graph. GCP not only works as an encoder but also serves as a content planner by using an entity-order aware topological traversal to encode a graph. This way, GCP can capture the relationships between entities in a knowledge graph as well as provide information on the proper entity order for the decoder. Hence, the decoder can generate sentences with a proper entity mention ordering. Experimental results show that GCP achieves improvements over state-of-the-art models of up to 3.6%, 4.1%, and 3.8% in three common metrics, BLEU, METEOR, and TER, respectively. The code is available at https://github.com/ruizhang-ai/GCP/
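
The entity-order aware topological traversal described in the summary can be pictured with a minimal sketch: given a set of (subject, predicate, object) triples, traverse the induced graph so that subjects tend to be mentioned before their objects, producing the order in which the decoder should mention entities. The Python sketch below is an illustration only, not the authors' GCP implementation (their code is in the linked repository); the function name plan_entity_order, the preferred_order tie-breaking parameter, and the example triples are hypothetical.

# Minimal, illustrative sketch of content planning over a triple graph.
# NOT the authors' GCP code (see https://github.com/ruizhang-ai/GCP/);
# names and the tie-breaking heuristic here are assumptions made purely
# to illustrate an entity-order aware topological traversal.

from collections import defaultdict

def plan_entity_order(triples, preferred_order=None):
    """Return an entity ordering for (subject, predicate, object) triples.

    A topological traversal of the triple graph is used so that subjects
    tend to precede their objects; ties are broken by preferred_order
    (e.g., an entity order observed in training data), when given.
    """
    graph = defaultdict(set)      # subject -> set of objects
    in_degree = defaultdict(int)  # entity -> number of incoming edges
    entities = set()

    for subj, _pred, obj in triples:
        entities.update((subj, obj))
        if obj not in graph[subj]:
            graph[subj].add(obj)
            in_degree[obj] += 1

    rank = {e: i for i, e in enumerate(preferred_order or [])}
    # Start from entities with no incoming edges (typical "root" subjects).
    ready = sorted((e for e in entities if in_degree[e] == 0),
                   key=lambda e: rank.get(e, len(rank)))

    order = []
    while ready:
        entity = ready.pop(0)
        order.append(entity)
        for nxt in graph[entity]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                ready.append(nxt)
        ready.sort(key=lambda e: rank.get(e, len(rank)))

    # Fall back for entities left unordered (e.g., cycles in the graph).
    order.extend(e for e in entities if e not in order)
    return order

if __name__ == "__main__":
    triples = [
        ("John_Doe", "birthPlace", "London"),
        ("London", "country", "United_Kingdom"),
        ("John_Doe", "occupation", "Engineer"),
    ]
    # Roots come first; ties among newly ready entities depend on
    # preferred_order (here unset), e.g.:
    # ['John_Doe', 'London', 'Engineer', 'United_Kingdom']
    print(plan_entity_order(triples))

In the paper's model, this ordering signal is produced while the graph is being encoded, so the encoder output both captures the relationships between entities and guides the decoder's entity mention order.
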
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2021.3118703