A scalable parallel Chinese online encyclopedia knowledge denoising method based on entry tags and Spark cluster
Published in | Applied intelligence (Dordrecht, Netherlands) Vol. 51; no. 10; pp. 7573 - 7599 |
---|---|
Main Authors | Wang, Ting; Li, Jie; Guo, Jiale |
Format | Journal Article |
Language | English |
Published | New York, Springer US, 01.10.2021, Springer Nature B.V |
Abstract | Because online encyclopedias are open and collaboratively edited, a large number of the knowledge triples they contain are improperly classified, so open-domain encyclopedia knowledge bases (KBs) must be denoised and refined to improve their quality and precision. However, missing and inaccurate semantic features of triples lead to poor refinement results. In addition, for large-scale encyclopedia KBs, processing massive amounts of knowledge leads to excessive computing time and poor scalability. To address knowledge denoising in Chinese encyclopedia systems, this paper first proposes, based on data field theory, a new Cartesian product mapping-based method (TripleES) to calculate the semantic similarity of entity triples, and on that basis a method for quantifying the quality of entry tags. Second, to further improve the denoising effect on KBs, it proposes a new method (TriplePV) that computes the potential value of a triple: a multi-feature fusion strategy is used to calculate the semantic distance between “out-of-vocabulary” entry tags, and this distance is embedded into the potential function. Third, to ensure good scalability, the proposed denoising algorithms are implemented and optimized in parallel on the Spark cluster-computing framework. Specifically, Spark-based TripleES (ES_Spark) and Spark-based TriplePV (PV_Spark) algorithms are proposed to calculate the semantic similarity and the potential value of triples, respectively. Finally, the denoising effect and time efficiency are compared comprehensively with the state-of-the-art distributed Chinese encyclopedia knowledge denoising algorithm. Experimental results on real-world datasets show that the proposed parallel denoising algorithm improves both the efficiency of knowledge denoising and the accuracy of KBs, outperforming the state-of-the-art methods. |
---|---|
Author | Guo, Jiale; Wang, Ting; Li, Jie |
Author details | 1. Wang, Ting (ORCID: 0000-0003-2481-2890; email: wangting@cueb.edu.cn), School of Management and Engineering, Capital University of Economics and Business; 2. Li, Jie, School of Software, Beijing Jiaotong University; 3. Guo, Jiale, School of Management and Engineering, Capital University of Economics and Business |
ContentType | Journal Article |
Copyright | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021. |
DOI | 10.1007/s10489-021-02295-5 |
Discipline | Computer Science |
EISSN | 1573-7497 |
EndPage | 7599 |
ISSN | 0924-669X |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 10 |
Keywords | Parallel computing; Knowledge base; Potential function; Semantic distance; Knowledge denoising; Chinese online encyclopedia |
Language | English |
ORCID | 0000-0003-2481-2890 |
PageCount | 27 |
PublicationCentury | 2000 |
PublicationDate | 2021-10-01 |
PublicationDecade | 2020 |
PublicationPlace | New York |
PublicationPlace_xml | – name: New York – name: Boston |
PublicationSubtitle | The International Journal of Research on Intelligent Systems for Real Life Complex Problems |
PublicationTitle | Applied intelligence (Dordrecht, Netherlands) |
PublicationTitleAbbrev | Appl Intell |
PublicationYear | 2021 |
Publisher | Springer US; Springer Nature B.V |
StartPage | 7573 |
SubjectTerms | Algorithms; Artificial Intelligence; Cartesian coordinates; Clusters; Computer Science; Computing time; Encyclopedias; Field theory; Knowledge; Knowledge bases (artificial intelligence); Machines; Manufacturing; Mathematical analysis; Mechanical Engineering; Noise reduction; Processes; Semantics; Similarity; Tags |
Title | A scalable parallel Chinese online encyclopedia knowledge denoising method based on entry tags and Spark cluster |
URI | https://link.springer.com/article/10.1007/s10489-021-02295-5 https://www.proquest.com/docview/2569112966/abstract/ |
Volume | 51 |
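
Illustrative sketch (not taken from the article). The abstract describes two ideas: a similarity computed over a Cartesian product mapping of entry tags, and a potential value derived from data field theory. The record does not give the formulas, so the following Python sketch only shows the general shape such a computation might take. Everything in it is an assumption made for illustration: the function names, the character-level Jaccard similarity used as a stand-in for the paper's semantic similarity, the choice of distance as one minus similarity, and the Gaussian-style potential form commonly used in data field theory. It is not the TripleES or TriplePV algorithm, and it runs on a single machine rather than on Spark.

```python
import math
from itertools import product


def jaccard(a: str, b: str) -> float:
    """Character-level Jaccard similarity (hypothetical stand-in for a semantic similarity)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cartesian_similarity(tags_a, tags_b, sim=jaccard) -> float:
    """Average pairwise similarity over the Cartesian product of two tag collections."""
    pairs = list(product(tags_a, tags_b))
    if not pairs:
        return 0.0
    return sum(sim(x, y) for x, y in pairs) / len(pairs)


def potential(distances, masses=None, sigma=1.0) -> float:
    """Gaussian-style data-field potential: sum of m_i * exp(-(d_i / sigma)^2)."""
    if masses is None:
        masses = [1.0] * len(distances)
    return sum(m * math.exp(-((d / sigma) ** 2)) for m, d in zip(masses, distances))


def triple_potential(entry_tags, reference_tag_sets, sigma=1.0) -> float:
    """Potential of a triple's entry tags relative to reference tag sets.

    Assumption: distance is taken as 1 - similarity, so closer tag sets
    contribute more potential.
    """
    distances = [
        1.0 - cartesian_similarity(entry_tags, ref) for ref in reference_tag_sets
    ]
    return potential(distances, sigma=sigma)
```

Under these assumptions, a triple whose entry tags resemble the reference tag sets receives a higher potential than one whose tags do not, and a threshold on that value could be used to flag likely noise. Because each triple is scored independently, this kind of per-triple computation is the sort of work that could in principle be distributed across a cluster, which is consistent with the Spark-based parallelization the abstract mentions.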