Open Domain Knowledge Extraction for Knowledge Graphs


Bibliographic Details
Main Authors Qian, Kun; Belyi, Anton; Wu, Fei; Khorshidi, Samira; Nikfarjam, Azadeh; Khot, Rahul; Sang, Yisi; Luna, Katherine; Chu, Xianqi; Choi, Eric; Govind, Yash; Seivwright, Chloe; Sun, Yiwen; Fakhry, Ahmed; Rekatsinas, Theo; Ilyas, Ihab; Qi, Xiaoguang; Li, Yunyao
Format Journal Article
Language English
Published 30.10.2023
Abstract The quality of a knowledge graph directly impacts the quality of downstream applications (e.g., the number of questions that can be answered using the graph). One ongoing challenge when building a knowledge graph is to ensure the completeness and freshness of the graph's entities and facts. In this paper, we introduce ODKE, a scalable and extensible framework that sources high-quality entities and facts from the open web at scale. ODKE utilizes a wide range of extraction models and supports both streaming and batch processing at different latencies. We reflect on the challenges and the design decisions made, and share lessons learned when building and deploying ODKE to grow an industry-scale open domain knowledge graph.
BackLink https://doi.org/10.48550/arXiv.2312.09424 (View paper in arXiv)
ContentType Journal Article
Copyright http://creativecommons.org/licenses/by/4.0
DBID AKY
GOX
DOI 10.48550/arxiv.2312.09424
DatabaseName arXiv Computer Science
arXiv.org
DatabaseTitleList
Database_xml – sequence: 1
  dbid: GOX
  name: arXiv.org
  url: http://arxiv.org/find
  sourceTypes: Open Access Repository
DeliveryMethod fulltext_linktorsrc
ExternalDocumentID 2312_09424
GroupedDBID AKY
GOX
IEDL.DBID GOX
IngestDate Mon Jan 08 05:45:29 EST 2024
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
OpenAccessLink https://arxiv.org/abs/2312.09424
ParticipantIDs arxiv_primary_2312_09424
PublicationCentury 2000
PublicationDate 2023-10-30
PublicationDecade 2020
PublicationYear 2023
Score 1.90166
SecondaryResourceType preprint
SourceID arxiv
SourceType Open Access Repository
SubjectTerms Computer Science - Artificial Intelligence
Computer Science - Computation and Language
Title Open Domain Knowledge Extraction for Knowledge Graphs
URI https://arxiv.org/abs/2312.09424
linkProvider Cornell University