A systematic review and comparative analysis of cross-document coreference resolution methods and tools
Published in | Computing Vol. 99; no. 4; pp. 313 - 349 |
---|---|
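Main Authors | Beheshti, Seyed-Mehdi-Reza; Benatallah, Boualem; Venugopal, Srikumar; Ryu, Seung Hwan; Motahari-Nezhad, Hamid Reza; Wang, Wei |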
Format | Journal Article |
Language | English |
Published | Vienna; Springer Vienna; 01.04.2017; Springer Nature B.V. |
Abstract | Information extraction (IE) is the task of automatically extracting structured information from unstructured or semi-structured machine-readable documents. Among IE tasks, extracting actionable intelligence from an ever-increasing amount of data depends critically on cross-document coreference resolution (CDCR), the task of identifying entity mentions across information sources that refer to the same underlying entity. CDCR is the basis of knowledge acquisition and is at the heart of Web search, recommendations, and analytics. Real-time CDCR processing is very important and has various applications in discovering must-know information for clients in finance, the public sector, news, and crisis management. Because CDCR is an emerging area of research and practice, the literature on its challenges and solutions is growing fast but is scattered, owing to the large space, the variety of applications, and datasets on the order of terabytes to petabytes. To fill this gap, we provide a systematic review of the state of the art in challenges and solutions for a CDCR process. We identify a set of quality attributes that have been frequently reported in the context of CDCR processes, to be used as a guide for identifying important and outstanding issues for further investigation. Finally, we assess existing tools and techniques for CDCR subtasks and provide guidance on the selection of tools and algorithms. |
---|---|
Author | Benatallah, Boualem; Beheshti, Seyed-Mehdi-Reza; Venugopal, Srikumar; Wang, Wei; Ryu, Seung Hwan; Motahari-Nezhad, Hamid Reza |
Author_xml | – sequence: 1 givenname: Seyed-Mehdi-Reza surname: Beheshti fullname: Beheshti, Seyed-Mehdi-Reza email: sbeheshti@cse.unsw.edu.au organization: School of Computer Science and Engineering, University of New South Wales – sequence: 2 givenname: Boualem surname: Benatallah fullname: Benatallah, Boualem organization: School of Computer Science and Engineering, University of New South Wales – sequence: 3 givenname: Srikumar surname: Venugopal fullname: Venugopal, Srikumar organization: School of Computer Science and Engineering, University of New South Wales – sequence: 4 givenname: Seung Hwan surname: Ryu fullname: Ryu, Seung Hwan organization: School of Computer Science and Engineering, University of New South Wales – sequence: 5 givenname: Hamid Reza surname: Motahari-Nezhad fullname: Motahari-Nezhad, Hamid Reza organization: School of Computer Science and Engineering, University of New South Wales, IBM Almaden Research Center – sequence: 6 givenname: Wei surname: Wang fullname: Wang, Wei organization: School of Computer Science and Engineering, University of New South Wales |
ContentType | Journal Article |
Copyright | Springer-Verlag Wien 2016. Computing is a copyright of Springer, 2017. |
Copyright_xml | – notice: Springer-Verlag Wien 2016 – notice: Computing is a copyright of Springer, 2017. |
DOI | 10.1007/s00607-016-0490-0 |
Discipline | Mathematics Computer Science |
EISSN | 1436-5057 |
EndPage | 349 |
ExternalDocumentID | 4321169011 10_1007_s00607_016_0490_0 |
Genre | Feature |
ISSN | 0010-485X |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Keywords | Information extraction · 68 Computer Science · Large datasets · Cross-document coreference resolution · 68U15 Text processing; mathematical typography · 68-02 Research exposition (monographs, survey articles) |
Language | English |
PageCount | 37 |
PublicationCentury | 2000 |
PublicationDate | 2017-04-01 |
PublicationDateYYYYMMDD | 2017-04-01 |
PublicationDate_xml | – month: 04 year: 2017 text: 2017-04-01 day: 01 |
PublicationDecade | 2010 |
PublicationPlace | Vienna |
PublicationPlace_xml | – name: Vienna – name: Wien |
PublicationSubtitle | Archives for Scientific Computing |
PublicationTitle | Computing |
PublicationTitleAbbrev | Computing |
PublicationYear | 2017 |
Publisher | Springer Vienna Springer Nature B.V |
Publisher_xml | – name: Springer Vienna – name: Springer Nature B.V |
StartPage | 313 |
SubjectTerms | Algorithms · Analysis · Artificial Intelligence · Comparative analysis · Computer Appl. in Administrative Data Processing · Computer Communication Networks · Computer Science · Datasets · Information retrieval · Information sources · Information systems · Information Systems Applications (incl. Internet) · Intelligence · Knowledge acquisition · Literature reviews · Management of crises · Mathematical models · Natural language · Public sector · Real time · Recommendations · Searching · Software Engineering · Studies · Systematic review · Tasks · Taxonomy |
Title | A systematic review and comparative analysis of cross-document coreference resolution methods and tools |
URI | https://link.springer.com/article/10.1007/s00607-016-0490-0 https://www.proquest.com/docview/1880865925 https://search.proquest.com/docview/1893902701 |
Volume | 99 |