A scalable parallel Chinese online encyclopedia knowledge denoising method based on entry tags and Spark cluster

Bibliographic Details
Published in Applied Intelligence (Dordrecht, Netherlands), Vol. 51, no. 10, pp. 7573-7599
Main Authors Wang, Ting; Li, Jie; Guo, Jiale
Format Journal Article
Language English
Published New York Springer US 01.10.2021
Springer Nature B.V
Abstract Because online encyclopedias are edited openly and collaboratively, a large number of knowledge triples in them are improperly classified, so open-domain encyclopedia knowledge bases (KBs) must be denoised and refined to improve their quality and precision. However, missing and inaccurate semantic features of triples lead to poor refinement results. In addition, for large-scale encyclopedia KBs, processing massive amounts of knowledge leads to long computing times and poor algorithm scalability. To address knowledge denoising in Chinese encyclopedia systems, this paper first proposes, based on data field theory, a new Cartesian product mapping-based method (TripleES) for calculating the semantic similarity of entity triples, and builds on it a method for quantifying the quality of entry tags. Second, to further improve the denoising effect on KBs, it proposes a new method (TriplePV) for computing the potential value of a triple: a multi-feature fusion strategy is used to calculate the semantic distance between “out-of-vocabulary” entry tags, and this distance is embedded into the potential function. Third, to ensure good scalability, the proposed denoising algorithms are implemented and optimized in parallel on the Spark cluster-computing framework. Specifically, Spark-based TripleES (ES_Spark) and Spark-based TriplePV (PV_Spark) are proposed to calculate the semantic similarity and the potential value of triples, respectively. Finally, the denoising effect and time efficiency are compared comprehensively with the state-of-the-art distributed Chinese encyclopedia knowledge denoising algorithm. Experimental results on real-world datasets show that the proposed parallel denoising algorithm improves both the efficiency of knowledge denoising and the accuracy of KBs, outperforming the state-of-the-art methods.
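The abstract names two quantities, a Cartesian product mapping-based similarity between tag sets and a data-field potential value, without giving their formulas. The sketch below is only an illustration of what such quantities can look like, not the paper's TripleES or TriplePV definitions. Everything in it is an assumption: the function names (`char_jaccard`, `cartesian_similarity`, `potential`), the character-overlap similarity used as a stand-in for a real semantic measure, the plain averaging over all tag pairs, and the Gaussian-like potential form with an influence factor `sigma`.

```python
import math
from itertools import product


def char_jaccard(a: str, b: str) -> float:
    """Stand-in pairwise similarity (assumption): Jaccard overlap of character sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cartesian_similarity(tags_a, tags_b, sim=char_jaccard) -> float:
    """Mean pairwise similarity over the Cartesian product of two tag sets.

    Hypothetical aggregation: the record does not say how the paper
    combines the pairwise scores.
    """
    pairs = list(product(tags_a, tags_b))
    if not pairs:
        return 0.0
    return sum(sim(x, y) for x, y in pairs) / len(pairs)


def potential(distances, masses=None, sigma=1.0) -> float:
    """Gaussian-like field potential: sum of m_i * exp(-(d_i / sigma) ** 2).

    `distances` are semantic distances to neighbouring items, `masses`
    their weights (default 1.0 each); both are illustrative inputs.
    """
    if masses is None:
        masses = [1.0] * len(distances)
    return sum(m * math.exp(-(d / sigma) ** 2) for m, d in zip(masses, distances))


if __name__ == "__main__":
    s = cartesian_similarity(["physicist", "scientist"], ["scientist", "person"])
    # Turn similarity into a distance before feeding it to the potential.
    print(round(s, 3), round(potential([1.0 - s]), 3))
```

In this toy setup a triple whose tags lie close to many well-weighted neighbours receives a high potential, and a low potential would mark it as a candidate for removal; a Spark version would presumably distribute the pairwise comparisons across partitions, which is not shown here.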
Author Wang, Ting (ORCID 0000-0003-2481-2890; email wangting@cueb.edu.cn), School of Management and Engineering, Capital University of Economics and Business
Li, Jie, School of Software, Beijing Jiaotong University
Guo, Jiale, School of Management and Engineering, Capital University of Economics and Business
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021
DOI 10.1007/s10489-021-02295-5
Discipline Computer Science
EISSN 1573-7497
EndPage 7599
ISSN 0924-669X
IsPeerReviewed true
IsScholarly true
Issue 10
Keywords Parallel computing
Knowledge base
Potential function
Semantic distance
Knowledge denoising
Chinese online encyclopedia
PageCount 27
PublicationDate 2021-10-01
PublicationDateYYYYMMDD 2021-10-01
PublicationDate_xml – month: 10
  year: 2021
  text: 2021-10-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
– name: Boston
PublicationSubtitle The International Journal of Research on Intelligent Systems for Real Life Complex Problems
PublicationTitle Applied intelligence (Dordrecht, Netherlands)
PublicationTitleAbbrev Appl Intell
PublicationYear 2021
Publisher Springer US
Springer Nature B.V
Publisher_xml – name: Springer US
– name: Springer Nature B.V
StartPage 7573
SubjectTerms Algorithms
Artificial Intelligence
Cartesian coordinates
Clusters
Computer Science
Computing time
Encyclopedias
Field theory
Knowledge
Knowledge bases (artificial intelligence)
Machines
Manufacturing
Mathematical analysis
Mechanical Engineering
Noise reduction
Processes
Semantics
Similarity
Tags
URI https://link.springer.com/article/10.1007/s10489-021-02295-5
https://www.proquest.com/docview/2569112966/abstract/
Volume 51