Constructing and Cleaning Identity Graphs in the LOD Cloud
Published in | Data Intelligence, Vol. 2, no. 3, pp. 323–352 |
---|---|
Main Authors | Raad, Joe; Beek, Wouter; van Harmelen, Frank; Wielemaker, Jan; Pernelle, Nathalie; Saïs, Fatiha |
Format | Journal Article |
Language | English |
Published | MIT Press (MIT Press Journals), One Rogers Street, Cambridge, MA 02142-1209, USA, 1 July 2020 |
Subjects | Graph theory; Identity; Linked Open Data; Names; Quality; Reasoning; Semantic web; Semantics |
ISSN | 2641-435X |
DOI | 10.1162/dint_a_00057 |
Abstract | In the absence of a central naming authority on the Semantic Web, it is common for different data sets to refer to the same thing by different names. Whenever multiple names are used to denote the same thing, owl:sameAs statements are needed in order to link the data and foster reuse. Studies dating back as far as 2009 observed that the owl:sameAs property is sometimes used incorrectly. In our previous work, we presented an identity graph containing over 500 million explicit and 35 billion implied owl:sameAs statements, along with a scalable approach for automatically calculating an error degree for each identity statement. In this paper, we generate subgraphs of the overall identity graph that correspond to certain error degrees. We show that even though the Semantic Web contains many erroneous owl:sameAs statements, it is still possible to use Semantic Web data while at the same time minimising the adverse effects of misusing owl:sameAs. |
---|---|
Author | Raad, Joe; Pernelle, Nathalie; Wielemaker, Jan; van Harmelen, Frank; Saïs, Fatiha; Beek, Wouter |
Author_xml | – sequence: 1 givenname: Joe surname: Raad fullname: Raad, Joe email: j.raad@vu.nl organization: Department of Computer Science, Vrije University, Amsterdam, The Netherlands – sequence: 2 givenname: Wouter surname: Beek fullname: Beek, Wouter organization: Department of Computer Science, Vrije University, Amsterdam, The Netherlands – sequence: 3 givenname: Frank surname: van Harmelen fullname: van Harmelen, Frank organization: Department of Computer Science, Vrije University, Amsterdam, The Netherlands – sequence: 4 givenname: Jan surname: Wielemaker fullname: Wielemaker, Jan organization: Department of Computer Science, Vrije University, Amsterdam, The Netherlands – sequence: 5 givenname: Nathalie surname: Pernelle fullname: Pernelle, Nathalie – sequence: 6 givenname: Fatiha surname: Saïs fullname: Saïs, Fatiha |
CitedBy_id | crossref_primary_10_1109_ACCESS_2023_3250105 crossref_primary_10_1145_3721985 |
ContentType | Journal Article |
Copyright | 2020. This work is published under https://creativecommons.org/licenses/by/4.0/legalcode (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
EISSN | 2641-435X |
EndPage | 352 |
ISSN | 2641-435X |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 3 |
Language | English |
Notes | Summer, 2020 |
OpenAccessLink | https://www.proquest.com/docview/2890967264?pq-origsite=%requestingapplication% |
PageCount | 30 |
PublicationCentury | 2000 |
PublicationDate | 2020-07-01 |
PublicationDateYYYYMMDD | 2020-07-01 |
PublicationDate_xml | – month: 07 year: 2020 text: 2020-07-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | One Rogers Street, Cambridge, MA 02142-1209, USA |
PublicationPlace_xml | – name: One Rogers Street, Cambridge, MA 02142-1209, USA – name: Cambridge |
PublicationTitle | Data Intelligence |
PublicationYear | 2020 |
Publisher | MIT Press; MIT Press Journals, The |
Publisher_xml | – name: MIT Press – name: MIT Press Journals, The |
StartPage | 323 |
SubjectTerms | Graph theory; Identity; Linked Open Data; Names; Quality; Reasoning; Semantic web; Semantics |
Title | Constructing and Cleaning Identity Graphs in the LOD Cloud |
URI | https://direct.mit.edu/dint/article/doi/10.1162/dint_a_00057 https://www.proquest.com/docview/2890967264 |
Volume | 2 |
hasFullText | 1 |
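
The abstract distinguishes explicit owl:sameAs statements from implied ones and mentions subgraphs selected by error degree. Because owl:sameAs is reflexive, symmetric, and transitive, explicit links partition terms into identity sets, and each set implies further statements. The following is a minimal sketch of that idea, not the authors' implementation: the `ex:` terms and error scores are made-up illustration data, the `max_error` threshold is hypothetical, and counting implied statements as ordered pairs of distinct terms is an assumption that may differ from the paper's counting.

```python
from collections import defaultdict


def identity_sets(pairs):
    """Group terms into equivalence classes under the closure of sameAs pairs."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = defaultdict(set)
    for term in list(parent):
        groups[find(term)].add(term)
    return sorted(sorted(g) for g in groups.values())


def implied_count(sets):
    """Assumption: count ordered pairs of distinct terms within each set."""
    return sum(len(s) * (len(s) - 1) for s in sets)


def filter_by_error(scored, max_error):
    """Keep only links whose (hypothetical) error degree is at most max_error."""
    return [(a, b) for a, b, err in scored if err <= max_error]


# Illustrative data only: (subject, object, error degree in [0, 1]).
scored = [
    ("ex:a", "ex:b", 0.1),
    ("ex:b", "ex:c", 0.2),
    ("ex:c", "ex:d", 0.9),
    ("ex:e", "ex:f", 0.0),
]
all_sets = identity_sets([(a, b) for a, b, _ in scored])
clean_sets = identity_sets(filter_by_error(scored, 0.5))
```

In this toy input, dropping the one high-error link removes `ex:d` from the larger identity set, which shrinks the number of implied statements. This illustrates why filtering on error degree before computing the closure can limit how far a single wrong link propagates.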