Open Domain Knowledge Extraction for Knowledge Graphs
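Main Authors | Qian, Kun; Belyi, Anton; Wu, Fei; Khorshidi, Samira; Nikfarjam, Azadeh; Khot, Rahul; Sang, Yisi; Luna, Katherine; Chu, Xianqi; Choi, Eric; Govind, Yash; Seivwright, Chloe; Sun, Yiwen; Fakhry, Ahmed; Rekatsinas, Theo; Ilyas, Ihab; Qi, Xiaoguang; Li, Yunyao |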
---|---|
Format | Journal Article |
Language | English |
Published | 30.10.2023 |
Subjects | Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
Abstract | The quality of a knowledge graph directly impacts the quality of downstream applications (e.g. the number of answerable questions using the graph). One ongoing challenge when building a knowledge graph is to ensure completeness and freshness of the graph's entities and facts. In this paper, we introduce ODKE, a scalable and extensible framework that sources high-quality entities and facts from open web at scale. ODKE utilizes a wide range of extraction models and supports both streaming and batch processing at different latency. We reflect on the challenges and design decisions made and share lessons learned when building and deploying ODKE to grow an industry-scale open domain knowledge graph. |
---|---|
Author | Qian, Kun; Belyi, Anton; Wu, Fei; Khorshidi, Samira; Nikfarjam, Azadeh; Khot, Rahul; Sang, Yisi; Luna, Katherine; Chu, Xianqi; Choi, Eric; Govind, Yash; Seivwright, Chloe; Sun, Yiwen; Fakhry, Ahmed; Rekatsinas, Theo; Ilyas, Ihab; Qi, Xiaoguang; Li, Yunyao |
BackLink | https://doi.org/10.48550/arXiv.2312.09424 (View paper in arXiv) |
ContentType | Journal Article |
Copyright | http://creativecommons.org/licenses/by/4.0 |
DOI | 10.48550/arxiv.2312.09424 |
DatabaseName | arXiv Computer Science arXiv.org |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
OpenAccessLink | https://arxiv.org/abs/2312.09424 |
PublicationDate | 2023-10-30 |
SecondaryResourceType | preprint |
SubjectTerms | Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
URI | https://arxiv.org/abs/2312.09424 |