I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets

Bibliographic Details
Published in	Artificial intelligence and law Vol. 32; no. 3; pp. 839 - 862
Main Author	Braun, Daniel
Format	Journal Article
Language	English
Published	Dordrecht Springer Netherlands 01.09.2024 Springer Springer Nature B.V
Subjects	Annotations; Annotations and citations (Law); Artificial Intelligence; Computer Science; Data analysis; Datasets; Documents; Evaluation; Information Storage and Retrieval; Intellectual Property; IT Law; Legal Aspects of Computing; Legislation; Machine learning; Media Law; Original Research; Peer review; Philosophy of Law; Annotator agreement; Data annotation; Legal corpora
Online Access	Get full text
ISSN	0924-8463 1572-8382
DOI	10.1007/s10506-023-09369-4

Abstract	Legal documents, like contracts or laws, are subject to interpretation. Different people can have different interpretations of the very same document. Large parts of judicial branches all over the world are concerned with settling disagreements that arise, in part, from these different interpretations. In this context, it seems only natural that during the annotation of legal machine learning data sets, disagreement, how to report it, and how to handle it should play an important role. This article presents an analysis of the current state of the art in the annotation of legal machine learning data sets. The results of the analysis show that all of the analysed data sets remove all traces of disagreement, instead of trying to utilise the information that might be contained in conflicting annotations. Additionally, the publications introducing the data sets often provide little information about the process that derives the “gold standard” from the initial annotations, which often makes it difficult to judge the reliability of the annotation process. Based on the state of the art, the article provides easily implementable suggestions on how to improve the handling and reporting of disagreement in the annotation of legal machine learning data sets.
Audience	Professional
Author	Braun, Daniel
Author_xml	– sequence: 1 givenname: Daniel orcidid: 0000-0001-8120-3368 surname: Braun fullname: Braun, Daniel email: d.braun@utwente.nl organization: Department of High-Tech Business and Entrepreneurship, University of Twente
CitedBy_id	crossref_primary_10_1007_s10506_024_09423_9
ContentType	Journal Article
Copyright	The Author(s) 2023 COPYRIGHT 2024 Springer The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DeliveryMethod	fulltext_linktorsrc
Discipline	Law Computer Science
EISSN	1572-8382
EndPage	862
ISSN	0924-8463
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	3
Keywords	Annotator agreement; Data annotation; Legal corpora
Language	English
ORCID	0000-0001-8120-3368
PageCount	24
PublicationCentury	2000
PublicationDate	20240900 2024-09-00 20240901
PublicationDateYYYYMMDD	2024-09-01
PublicationDate_xml	– month: 9 year: 2024 text: 20240900
PublicationDecade	2020
PublicationPlace	Dordrecht
PublicationPlace_xml	– name: Dordrecht
PublicationTitle	Artificial intelligence and law
PublicationTitleAbbrev	Artif Intell Law
PublicationYear	2024
Publisher	Springer Netherlands Springer Springer Nature B.V
Publisher_xml	– name: Springer Netherlands – name: Springer – name: Springer Nature B.V
https://doi.org/10.1145/3209978.3210161 – reference: Zahidi Y, El Younoussi Y, Azroumahli C (2019) Comparative study of the most useful Arabic-supporting natural language processing and deep learning libraries. In: 2019 5th international conference on optimization and applications (ICOA), pp 1–10. https://doi.org/10.1109/ICOA.2019.8727617 – reference: Steinberger R, Pouliquen B, Widiger A et al. (2006) The JRC-Acquis: a multilingual aligned parallel corpus with 20+ languages. In: Proceedings of the fifth international conference on language resources and evaluation (LREC’06). European Language Resources Association (ELRA), Genoa, Italy. http://www.lrec-conf.org/proceedings/lrec2006/pdf/340_pdf.pdf – reference: Braun D, Matthes F (2021) NLP for consumer protection: battling illegal clauses in German terms and conditions in online shopping. In: Proceedings of the 1st workshop on NLP for positive impact. Association for Computational Linguistics, Online, pp 93–99. https://doi.org/10.18653/v1/2021.nlp4posimpact-1.10 – reference: Chalkidis I, Androutsopoulos I, Michos A (2017) Extracting contract elements. In: Proceedings of the 16th edition of the international conference on artificial intelligence and law. Association for Computing Machinery, New York, NY, USA, ICAIL ’17, pp 19–28. https://doi.org/10.1145/3086512.3086515 – reference: KrippendorffKContent analysis: an introduction to its methodology20184Thousand OaksSage Publications – reference: Grover C, Hachey B, Hughson I (2004) The HOLJ corpus. Supporting summarisation of legal texts. In: Proceedings of the 5th international workshop on linguistically interpreted Corpora. COLING, Geneva, Switzerland, pp 47–54. https://aclanthology.org/W04-1907 – reference: Kalamkar P, Tiwari A, Agarwal A et al. (2022) Corpus for automatic structuring of legal documents. CoRR arxiv:2201.13125 – reference: Šavelka J, Ashley KD (2018) Segmenting us court decisions into functional and issue specific parts. 
In: Legal knowledge and information systems. IOS Press, pp 111–120 – reference: Basile V, Cabitza F, Campagner A et al. (2021) Toward a perspectivist turn in ground truthing for predictive computing. CoRR arxiv:2109.04270 – reference: Jamison E, Gurevych I (2015) Noise or additional information? leveraging crowdsource annotation item agreement for natural language tasks. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, Portugal, pp 291–297. https://doi.org/10.18653/v1/D15-1035 – reference: LiSA corpus-based study of vague language in legislative texts: strategic use of vague termsEngl Specif Purp2017459810910.1016/j.esp.2016.10.001 – reference: DuanXWangBWangZSunMHuangXJiHCJRC: a reliable human-annotated benchmark dataset for Chinese judicial reading comprehensionChinese computational linguistics2019ChamSpringer43945110.1007/978-3-030-32381-3_36 – reference: Chan B, Schweter S, Möller T (2020) German’s next language model. In: Proceedings of the 28th international conference on computational linguistics. International Committee on Computational Linguistics, Barcelona, Spain (Online), pp 6788–6796. https://doi.org/10.18653/v1/2020.coling-main.598 – reference: FleissJLMeasuring nominal scale agreement among many ratersPsychol Bull197176537810.1037/h0031619 – reference: SasCCapiluppiAAntipatterns in software classification taxonomiesJ Syst Softw202219011134310.1016/j.jss.2022.111343 – reference: Ovesdotter Alm C (2011) Subjective natural language problems: motivations, applications, characterizations, and implications. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Portland, Oregon, USA, pp 107–112. 
https://aclanthology.org/P11-2019 – reference: LippiMPałkaPContissaGClaudette: an automated detector of potentially unfair clauses in online terms of serviceArtif Intell Law201927211713910.1007/s10506-019-09243-2 – reference: Drawzeski K, Galassi A, Jablonowska A et al. (2021) A corpus for multilingual analysis of online terms of service. In: Proceedings of the natural legal language processing workshop 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 1–8. https://doi.org/10.18653/v1/2021.nllp-1.1 – ident: 9369_CR57 – ident: 9369_CR28 – ident: 9369_CR63 – ident: 9369_CR50 doi: 10.1145/3322640.3326736 – ident: 9369_CR44 doi: 10.18653/v1/2022.semeval-1.42 – volume: 76 start-page: 378 issue: 5 year: 1971 ident: 9369_CR18 publication-title: Psychol Bull doi: 10.1037/h0031619 – ident: 9369_CR25 – ident: 9369_CR43 doi: 10.18653/v1/2021.law-1.14 – volume: 2019 start-page: 66 year: 2019 ident: 9369_CR66 publication-title: Proc Priv Enhanc Technol – ident: 9369_CR10 doi: 10.1145/3086512.3086515 – volume: 64 start-page: 86 issue: 12 year: 2021 ident: 9369_CR19 publication-title: Commun ACM doi: 10.1145/3458723 – ident: 9369_CR56 doi: 10.5220/0010187305150521 – volume: 34 start-page: 124 issue: 1 year: 2012 ident: 9369_CR13 publication-title: Comput Stand Interfaces doi: 10.1016/j.csi.2011.06.002 – ident: 9369_CR16 doi: 10.18653/v1/2021.nllp-1.1 – ident: 9369_CR46 doi: 10.18653/v1/2022.naacl-main.13 – ident: 9369_CR1 doi: 10.1609/hcomp.v8i1.7473 – volume: 27 start-page: 117 issue: 2 year: 2019 ident: 9369_CR35 publication-title: Artif Intell Law doi: 10.1007/s10506-019-09243-2 – ident: 9369_CR54 – ident: 9369_CR40 doi: 10.1145/3383583.3398616 – ident: 9369_CR58 – ident: 9369_CR45 doi: 10.1145/3209978.3210015 – start-page: 665 volume-title: Medical image computing and computer assisted intervention—MICCAI 2019 year: 2019 ident: 9369_CR53 – ident: 9369_CR5 doi: 10.3115/1611628.1611630 – ident: 9369_CR29 – ident: 9369_CR12 doi: 
10.18653/v1/2020.coling-main.598 – volume-title: Content analysis: an introduction to its methodology year: 2018 ident: 9369_CR32 – volume: 70 start-page: 213 issue: 4 year: 1968 ident: 9369_CR14 publication-title: Psychol Bull doi: 10.1037/h0026256 – ident: 9369_CR47 – ident: 9369_CR64 doi: 10.1109/ICOA.2019.8727617 – start-page: 439 volume-title: Chinese computational linguistics year: 2019 ident: 9369_CR17 doi: 10.1007/978-3-030-32381-3_36 – ident: 9369_CR22 – volume: 34 start-page: 555 issue: 4 year: 2008 ident: 9369_CR3 publication-title: Comput Linguist doi: 10.1162/coli.07-034-R2 – ident: 9369_CR59 doi: 10.18653/v1/P16-1126 – ident: 9369_CR4 – ident: 9369_CR30 doi: 10.18653/v1/2022.semeval-1.73 – ident: 9369_CR61 – ident: 9369_CR42 – ident: 9369_CR49 – start-page: 681 volume-title: Information processing and management of uncertainty in knowledge-based systems year: 2022 ident: 9369_CR31 doi: 10.1007/978-3-031-08974-9_54 – ident: 9369_CR23 – volume: 190 start-page: 343 issue: 111 year: 2022 ident: 9369_CR48 publication-title: J Syst Softw doi: 10.1016/j.jss.2022.111343 – ident: 9369_CR26 doi: 10.5040/9781509932771.ch-001 – ident: 9369_CR21 doi: 10.1145/3379597.3387473 – volume: 33 start-page: 159 year: 1977 ident: 9369_CR33 publication-title: Biometrics doi: 10.2307/2529310 – ident: 9369_CR52 – volume: 545 start-page: 771 year: 2021 ident: 9369_CR9 publication-title: Inf Sci doi: 10.1016/j.ins.2020.09.049 – ident: 9369_CR7 doi: 10.18653/v1/2021.nlp4posimpact-1.10 – volume: 3 year: 2017 ident: 9369_CR60 publication-title: PeerJ Comput Sci doi: 10.7717/peerj-cs.134 – ident: 9369_CR36 doi: 10.1145/3209978.3210161 – ident: 9369_CR62 – ident: 9369_CR8 doi: 10.18653/v1/2022.ecnlp-1.23 – ident: 9369_CR39 doi: 10.18653/v1/W19-2201 – ident: 9369_CR41 – volume: 10 start-page: 92 year: 2022 ident: 9369_CR15 publication-title: Trans Assoc Comput Linguist doi: 10.1162/tacl_a_00449 – ident: 9369_CR37 doi: 10.18653/v1/2022.acl-long.468 – ident: 9369_CR27 doi: 
10.18653/v1/D15-1035 – start-page: 297 volume-title: Inter-annotator agreement year: 2017 ident: 9369_CR2 doi: 10.1007/978-94-024-0881-2_11 – ident: 9369_CR11 doi: 10.18653/v1/2022.acl-long.297 – ident: 9369_CR20 – volume: 45 start-page: 98 year: 2017 ident: 9369_CR34 publication-title: Engl Specif Purp doi: 10.1016/j.esp.2016.10.001 – ident: 9369_CR51 – ident: 9369_CR6 doi: 10.18653/v1/2020.findings-emnlp.380 – ident: 9369_CR24 doi: 10.48550/arXiv.2208.06178 – ident: 9369_CR38 – ident: 9369_CR55 – ident: 9369_CR65 doi: 10.1609/aaai.v34i05.6519
StartPage	839
Title	I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets
URI	https://link.springer.com/article/10.1007/s10506-023-09369-4 https://www.proquest.com/docview/3086492163
Volume	32