GCP: Graph Encoder With Content-Planning for Sentence Generation From Knowledge Bases
Published in | IEEE transactions on pattern analysis and machine intelligence Vol. 44; no. 11; pp. 7521 - 7533 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | New York, IEEE, 01.11.2022, The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Summary: | A knowledge base is a large repository of facts usually represented as triples, each consisting of a subject, a predicate, and an object. The triples together form a graph, i.e., a knowledge graph. The triple representation in a knowledge graph offers a simple interface for applications to access the facts. However, this representation is not in a natural language form, which is difficult for humans to understand. We address this problem by proposing a system to translate a set of triples (i.e., a graph) into natural sentences. We take an encoder-decoder based approach. Specifically, we propose a Graph encoder with Content-Planning capability (GCP) to encode an input graph. GCP not only works as an encoder but also serves as a content-planner by using an entity-order aware topological traversal to encode a graph. This way, GCP can capture the relationships between entities in a knowledge graph as well as providing information regarding the proper entity order for the decoder. Hence, the decoder can generate sentences with a proper entity mention ordering. Experimental results show that GCP achieves improvements over state-of-the-art models by up to 3.6%, 4.1%, and 3.8% in three common metrics BLEU, METEOR, and TER, respectively. The code is available at https://github.com/ruizhang-ai/GCP/ |
---|---|
ISSN: | 0162-8828, 1939-3539, 2160-9292 |
DOI: | 10.1109/TPAMI.2021.3118703 |
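
The summary describes facts stored as (subject, predicate, object) triples that together form a graph, and an encoder that walks that graph in a topological order. As a rough illustration only, the sketch below builds a graph from a few triples and orders the entities with a plain Kahn-style topological sort. It is not the GCP encoder and does not reproduce its entity-order aware traversal; the function names and the example triples are hypothetical and do not come from the paper or its repository.

```python
from collections import defaultdict, deque


def triples_to_graph(triples):
    """Build an adjacency map (subject -> [(predicate, object)]) and an
    ordered list of entities in first-seen order."""
    adjacency = defaultdict(list)
    nodes = {}  # dict used as an insertion-ordered set
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
        nodes.setdefault(subject, None)
        nodes.setdefault(obj, None)
    return adjacency, list(nodes)


def topological_order(triples):
    """Return the entities in a topological order (Kahn's algorithm).

    This is a generic traversal, used here only to illustrate the idea of
    ordering entities by graph structure. It raises ValueError on cycles.
    """
    adjacency, nodes = triples_to_graph(triples)

    # Count incoming edges for every entity.
    indegree = {node: 0 for node in nodes}
    for edges in adjacency.values():
        for _, obj in edges:
            indegree[obj] += 1

    # Start from entities that are never the object of a triple.
    queue = deque(node for node in nodes if indegree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for _, obj in adjacency.get(node, []):
            indegree[obj] -= 1
            if indegree[obj] == 0:
                queue.append(obj)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle; no topological order exists")
    return order


if __name__ == "__main__":
    # Hypothetical example triples, invented for illustration.
    example = [
        ("John Doe", "birthPlace", "London"),
        ("London", "capitalOf", "England"),
        ("John Doe", "occupation", "Engineer"),
    ]
    print(topological_order(example))
```

A generic sort like this only guarantees that a subject appears before the objects it points to. According to the summary, GCP's traversal additionally takes entity order into account so that the decoder mentions entities in a proper order.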