Open Knowledge Graphs Canonicalization using Variational Autoencoders
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 08.12.2020 |
Subjects | |
Summary: | Noun phrases and relation phrases in open knowledge graphs are not
canonicalized, leading to an explosion of redundant and ambiguous
subject-relation-object triples. Existing approaches to this problem work in
two steps: they first generate embedding representations for both noun and
relation phrases, then group the phrases with a clustering algorithm that uses
the embeddings as features. In this work, we propose Canonicalizing Using
Variational Autoencoders (CUVA), a joint model to learn both embeddings and
cluster assignments in an end-to-end approach, which leads to a better vector
representation for the noun and relation phrases. Our evaluation over multiple
benchmarks shows that CUVA outperforms the existing state-of-the-art
approaches. Moreover, we introduce CanonicNell, a novel dataset to evaluate
entity canonicalization systems. |
---|---|
DOI: | 10.48550/arxiv.2012.04780 |
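
The two-step baseline described in the summary (embed the phrases, then cluster on the embeddings) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `embed`, `cosine`, and `canonicalize` are hypothetical names, character trigram counts stand in for learned embeddings, and the single-link threshold clustering is a simple placeholder. It is not the CUVA model, nor the pipeline of any system evaluated in the paper.

```python
from collections import Counter
from math import sqrt


def embed(phrase, n=3):
    """Step 1 (toy stand-in for a learned embedding): character n-gram counts."""
    text = f" {phrase.lower()} "
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def canonicalize(phrases, threshold=0.5):
    """Step 2: single-link clustering over the step-1 vectors.

    Two phrases share a cluster when a chain of pairwise similarities
    at or above `threshold` (an arbitrary illustrative value) links them.
    """
    vectors = [embed(p) for p in phrases]
    parent = list(range(len(phrases)))  # union-find forest, one node per phrase

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(phrases)):
        for j in range(i + 1, len(phrases)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, phrase in enumerate(phrases):
        clusters.setdefault(find(i), []).append(phrase)
    return list(clusters.values())


if __name__ == "__main__":
    for cluster in canonicalize(["Barack Obama", "barack obama", "New York City"]):
        print(cluster)
```

In this sketch the embedding step is fixed before clustering starts, so clustering errors cannot feed back into the representation. That separation is what the summary contrasts with a joint, end-to-end model.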