I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets

Bibliographic Details
Published in Artificial intelligence and law Vol. 32; no. 3; pp. 839-862
Main Author Braun, Daniel
Format Journal Article
Language English
Published Dordrecht Springer Netherlands 01.09.2024
Springer
Springer Nature B.V
ISSN 0924-8463
EISSN 1572-8382
DOI 10.1007/s10506-023-09369-4

Abstract Legal documents, like contracts or laws, are subject to interpretation. Different people can have different interpretations of the very same document. Large parts of judicial branches all over the world are concerned with settling disagreements that arise, in part, from these different interpretations. In this context, it seems only natural that, during the annotation of legal machine learning data sets, disagreement, how to report it, and how to handle it should play an important role. This article presents an analysis of the current state of the art in the annotation of legal machine learning data sets. The results of the analysis show that all of the analysed data sets remove all traces of disagreement, instead of trying to utilise the information that might be contained in conflicting annotations. Additionally, the publications introducing the data sets often provide little information about the process that derives the “gold standard” from the initial annotations, which often makes it difficult to judge the reliability of the annotation process. Based on the state of the art, the article provides easily implementable suggestions on how to improve the handling and reporting of disagreement in the annotation of legal machine learning data sets.
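The contrast the abstract draws, between collapsing annotations into a single "gold standard" and keeping the information contained in conflicting annotations, can be illustrated with a minimal sketch. This is not the article's method: the function names, the example labels ("fair", "unfair"), and the clause identifiers are hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Collapse annotations into one label; return None on a tie.

    This is the kind of aggregation that removes all traces of disagreement.
    """
    counts = Counter(labels)
    top = max(counts.values())
    winners = [label for label, count in counts.items() if count == top]
    return winners[0] if len(winners) == 1 else None


def label_distribution(labels):
    """Keep disagreement by reporting the share of annotators per label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in sorted(counts.items())}


def observed_agreement(labels):
    """Fraction of annotator pairs that chose the same label for one item."""
    n = len(labels)
    if n < 2:
        return 1.0
    counts = Counter(labels)
    agreeing_pairs = sum(c * (c - 1) for c in counts.values())
    return agreeing_pairs / (n * (n - 1))


# Hypothetical annotations: three annotators label two contract clauses.
annotations = {
    "clause_1": ["unfair", "unfair", "unfair"],
    "clause_2": ["unfair", "fair", "unfair"],
}

for item, labels in annotations.items():
    print(item, majority_vote(labels), label_distribution(labels),
          round(observed_agreement(labels), 2))
```

Under majority voting both hypothetical clauses end up with the same label, although only the first was labelled unanimously; the per-label distribution and the pairwise agreement keep that difference visible.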
Audience Professional
Author Braun, Daniel
Author_xml – sequence: 1
  givenname: Daniel
  orcidid: 0000-0001-8120-3368
  surname: Braun
  fullname: Braun, Daniel
  email: d.braun@utwente.nl
  organization: Department of High-Tech Business and Entrepreneurship, University of Twente
CitedBy_id crossref_primary_10_1007_s10506_024_09423_9
Cites_doi 10.1145/3322640.3326736
10.18653/v1/2022.semeval-1.42
10.1037/h0031619
10.18653/v1/2021.law-1.14
10.1145/3086512.3086515
10.1145/3458723
10.5220/0010187305150521
10.1016/j.csi.2011.06.002
10.18653/v1/2021.nllp-1.1
10.18653/v1/2022.naacl-main.13
10.1609/hcomp.v8i1.7473
10.1007/s10506-019-09243-2
10.1145/3383583.3398616
10.1145/3209978.3210015
10.3115/1611628.1611630
10.18653/v1/2020.coling-main.598
10.1037/h0026256
10.1109/ICOA.2019.8727617
10.1007/978-3-030-32381-3_36
10.1162/coli.07-034-R2
10.18653/v1/P16-1126
10.18653/v1/2022.semeval-1.73
10.1007/978-3-031-08974-9_54
10.1016/j.jss.2022.111343
10.5040/9781509932771.ch-001
10.1145/3379597.3387473
10.2307/2529310
10.1016/j.ins.2020.09.049
10.18653/v1/2021.nlp4posimpact-1.10
10.7717/peerj-cs.134
10.1145/3209978.3210161
10.18653/v1/2022.ecnlp-1.23
10.18653/v1/W19-2201
10.1162/tacl_a_00449
10.18653/v1/2022.acl-long.468
10.18653/v1/D15-1035
10.1007/978-94-024-0881-2_11
10.18653/v1/2022.acl-long.297
10.1016/j.esp.2016.10.001
10.18653/v1/2020.findings-emnlp.380
10.48550/arXiv.2208.06178
10.1609/aaai.v34i05.6519
ContentType Journal Article
Copyright The Author(s) 2023
COPYRIGHT 2024 Springer
The Author(s) 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline Law
Computer Science
EISSN 1572-8382
EndPage 862
ExternalDocumentID A803426487
10_1007_s10506_023_09369_4
ISSN 0924-8463
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords Annotator agreement
Data annotation
Legal corpora
Language English
ORCID 0000-0001-8120-3368
PageCount 24
PublicationCentury 2000
PublicationDate 20240900
2024-09-00
20240901
PublicationDateYYYYMMDD 2024-09-01
PublicationDate_xml – month: 9
  year: 2024
  text: 20240900
PublicationDecade 2020
PublicationPlace Dordrecht
PublicationPlace_xml – name: Dordrecht
PublicationTitle Artificial intelligence and law
PublicationTitleAbbrev Artif Intell Law
PublicationYear 2024
Publisher Springer Netherlands
Springer
Springer Nature B.V
Publisher_xml – name: Springer Netherlands
– name: Springer
– name: Springer Nature B.V
References Braun D, Matthes F (2022) Clause topic classification in German and English standard form contracts. In: Proceedings of the fifth workshop on e-commerce and NLP (ECNLP 5). Association for Computational Linguistics, Dublin, Ireland, pp 199–209. https://doi.org/10.18653/v1/2022.ecnlp-1.23
Šavelka J, Ashley KD (2018) Segmenting us court decisions into functional and issue specific parts. In: Legal knowledge and information systems. IOS Press, pp 111–120
Manor L, Li JJ (2019) Plain English summarization of contracts. In: Proceedings of the natural legal language processing workshop 2019. Association for Computational Linguistics, Minneapolis, Minnesota, pp 1–11. https://doi.org/10.18653/v1/W19-2201, https://aclanthology.org/W19-2201
Ovesdotter Alm C (2011) Subjective natural language problems: motivations, applications, characterizations, and implications. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Portland, Oregon, USA, pp 107–112. https://aclanthology.org/P11-2019
ZimmeckSStoryPSmullenDMaps: scaling privacy compliance analysis to a million appsProc Priv Enhanc Technol2019201966
LippiMPałkaPContissaGClaudette: an automated detector of potentially unfair clauses in online terms of serviceArtif Intell Law201927211713910.1007/s10506-019-09243-2
Xiao C, Zhong H, Sun Y (2021) Must-read papers on legal intelligence. Tech. rep., Tsinghua University. https://github.com/thunlp/LegalPapers
LandisJRKochGGThe measurement of observer agreement for categorical dataBiometrics19773315917410.2307/2529310
Poudyal P, Savelka J, Ieven A et al. (2020) ECHR: legal corpus for argument mining. In: Proceedings of the 7th workshop on argument mining. Association for Computational Linguistics, Online, pp 67–75. https://aclanthology.org/2020.argmining-1.8
DavaniAMDíazMPrabhakaranVDealing with disagreements: looking beyond the majority vote in subjective annotationsTrans Assoc Comput Linguist2022109211010.1162/tacl_a_00449
Tiwari A, Kalamkar P, Agarwal A et al. (2022) Must-read papers on legal intelligence. Tech. rep., OpenNyAI. https://github.com/Legal-NLP-EkStep/rhetorical-role-baseline
ArtsteinRInter-annotator agreement2017DordrechtSpringer29731310.1007/978-94-024-0881-2_11
DuanXWangBWangZSunMHuangXJiHCJRC: a reliable human-annotated benchmark dataset for Chinese judicial reading comprehensionChinese computational linguistics2019ChamSpringer43945110.1007/978-3-030-32381-3_36
Hendrycks D, Burns C, Chen A et al. (2021) CUAD: an expert-annotated NLP dataset for legal contract review. CoRR arxiv:2103.06268
Urchs S, Mitrović J, Granitzer M (2021) Design and implementation of German legal decision corpora. In: Proceedings of the 13th international conference on agents and artificial intelligence—volume 2: ICAART, INSTICC. SciTePress, pp 515–521. https://doi.org/10.5220/0010187305150521
Klemen M, Robnik-Šikonja M (2022) ULFRI at SemEval-2022 task 4: leveraging uncertainty and additional knowledge for patronizing and condescending language detection. In: Proceedings of the 16th international workshop on semantic evaluation (SemEval-2022). Association for Computational Linguistics, Seattle, United States, pp 525–532. https://doi.org/10.18653/v1/2022.semeval-1.73
Prabhakaran V, Mostafazadeh Davani A, Diaz M (2021) On releasing annotator-level labels and information in datasets. In: Proceedings of the Joint 15th linguistic annotation workshop (LAW) and 3rd designing meaning representations (DMR) workshop. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 133–138. https://doi.org/10.18653/v1/2021.law-1.14
FleissJLMeasuring nominal scale agreement among many ratersPsychol Bull197176537810.1037/h0031619
Rottger P, Vidgen B, Hovy D et al. (2022) Two contrasting data annotation paradigms for subjective NLP tasks. In: Proceedings of the 2022 conference of the North American chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Seattle, United States, pp 175–190. https://doi.org/10.18653/v1/2022.naacl-main.13
Akhtar S, Basile V, Patti V (2020) Modeling annotator perspective and polarized opinions to improve hate speech detection. In: Proceedings of the AAAI conference on human computation and crowdsourcing, vol 8, no 1, pp 151–154. https://doi.org/10.1609/hcomp.v8i1.7473
Glaser I, Scepankova E, Matthes F (2018) Classifying semantic types of legal sentences: portability of machine learning models. In: Legal knowledge and information systems. IOS Press, pp 61–70
Ostendorff M, Blume T, Ostendorff S (2020) Towards an open platform for legal information. In: Proceedings of the ACM/IEEE joint conference on digital libraries in 2020. Association for Computing Machinery, New York, NY, USA, JCDL ’20, pp 385–388. https://doi.org/10.1145/3383583.3398616
Zhong H, Xiao C, Tu C et al. (2020) JEC-QA: a legal-domain question answering dataset. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, no. 05, pp 9701–9708. https://doi.org/10.1609/aaai.v34i05.6519
Drawzeski K, Galassi A, Jablonowska A et al. (2021) A corpus for multilingual analysis of online terms of service. In: Proceedings of the natural legal language processing workshop 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 1–8. https://doi.org/10.18653/v1/2021.nllp-1.1
Chan B, Schweter S, Möller T (2020) German’s next language model. In: Proceedings of the 28th international conference on computational linguistics. International Committee on Computational Linguistics, Barcelona, Spain (Online), pp 6788–6796. https://doi.org/10.18653/v1/2020.coling-main.598
Tuggener D, von Däniken P, Peetz T et al. (2020) LEDGAR: a large-scale multi-label corpus for text classification of legal provisions in contracts. In: Proceedings of the twelfth language resources and evaluation conference. European Language Resources Association, Marseille, France, pp 1235–1241. https://aclanthology.org/2020.lrec-1.155
Beigman Klebanov B, Beigman E, Diermeier D (2008) Analyzing disagreements. In: Coling 2008: proceedings of the workshop on human judgements in computational linguistics. Coling 2008 Organizing Committee, Manchester, UK, pp 2–7. https://aclanthology.org/W08-1202
Gonzalez D, Zimmermann T, Nagappan N (2020) The state of the ML-universe: 10 years of artificial intelligence & machine learning software development on GitHub. In: Proceedings of the 17th international conference on mining software repositories. Association for Computing Machinery, New York, NY, USA, MSR ’20, pp 431–442. https://doi.org/10.1145/3379597.3387473
Locke D, Zuccon G (2018) A test collection for evaluating legal case law search. In: The 41st international ACM SIGIR conference on research & development in information retrieval. Association for Computing Machinery, New York, NY, USA, SIGIR ’18, pp 1261–1264. https://doi.org/10.1145/3209978.3210161
Zahidi Y, El Younoussi Y, Azroumahli C (2019) Comparative study of the most useful Arabic-supporting natural language processing and deep learning libraries. In: 2019 5th international conference on optimization and applications (ICOA), pp 1–10. https://doi.org/10.1109/ICOA.2019.8727617
WuYWangNKropczynskiJThe appropriation of GitHub for curationPeerJ Comput Sci2017310.7717/peerj-cs.134
Chalkidis I, Jana A, Hartung D et al. (2022) LexGLUE: a benchmark dataset for legal language understanding in English. In: Proceedings of the 60th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Dublin, Ireland, pp 4310–4330. https://doi.org/10.18653/v1/2022.acl-long.297
Xiao C, Zhong H, Guo Z et al. (2019) CAIL2019-SCM: a dataset of similar case matching in legal domain. CoRR arxiv:1911.08962
Kralj NovakPScantamburloTPeliconACiucciDCousoIMedinaJHandling disagreement in hate speech modellingInformation processing and management of uncertainty in knowledge-based systems2022ChamSpringer68169510.1007/978-3-031-08974-9_54
Roegiest A, Hudek AK, McNulty A (2018) A dataset and an examination of identifying passages for due diligence. In: The 41st international ACM SIGIR conference on research & development in information retrieval. Association for Computing Machinery, New York, NY, USA, SIGIR ’18, pp 465–474. https://doi.org/10.1145/3209978.3210015
Holland S, Hosny A, Newman S et al. (2020) The dataset nutrition label. Data protection and privacy, volume 12: data protection and democracy 12:1
Steinberger R, Pouliquen B, Widiger A et al. (2006) The JRC-Acquis: a multilingual aligned parallel corpus with 20+ languages. In: Proceedings of the fifth international conference on language resources and evaluation (LREC’06). European Language Resources Association (ELRA), Genoa, Italy. http://www.lrec-conf.org/proceedings/lrec2006/pdf/340_pdf.pdf
Borchmann Ł, Wisniewski D, Gretkowski A et al. (2020) Contract discovery: Dataset and a few-shot semantic retrieval challenge with competitive baselines. In: Findings of the association for computational linguistics: EMNLP 2020. Association for Computational Linguistics, Online, pp 4254–4268. https://doi.org/10.18653/v1/2020.findings-emnlp.380
Walker VR, Strong SR, Walker VE (2020) Automating the classification of finding sentences for linguistic polarity. In: Proceedings of the fourth workshop on automated semantic analysis of information in legal text
SudreCHAnsonBGIngalaSShenDLiuTPetersTMLet’s agree to disagree: learning highly debatable multirater labellingMedical image computing and computer assisted intervention—MICCAI 20192019ChamSpringer665673
CampagnerACiucciDSvenssonCMGround truthing from multi-rater labeling with three-way decision and possibility theoryInf Sci202154577179010.1016/j.ins.2020.09.049
Guha N (2021) Datasets for machine learning in law. Tech. rep., Stanford University, https://github.com/neelguha/legal-ml-datasets
Waltl B (2022) Legal text analytics. Tech. rep., L
C Sas (9369_CR48) 2022; 190
AM Davani (9369_CR15) 2022; 10
S Zimmeck (9369_CR66) 2019; 2019
9369_CR42
9369_CR41
9369_CR40
9369_CR49
9369_CR47
9369_CR46
M Chinosi (9369_CR13) 2012; 34
9369_CR45
9369_CR44
9369_CR43
Y Wu (9369_CR60) 2017; 3
9369_CR52
9369_CR51
9369_CR50
9369_CR6
R Artstein (9369_CR2) 2017
9369_CR5
9369_CR16
9369_CR8
9369_CR59
9369_CR7
9369_CR58
9369_CR57
9369_CR1
9369_CR12
9369_CR56
9369_CR4
9369_CR11
9369_CR55
9369_CR10
9369_CR54
X Duan (9369_CR17) 2019
S Li (9369_CR34) 2017; 45
T Gebru (9369_CR19) 2021; 64
9369_CR20
9369_CR64
9369_CR63
9369_CR62
JL Fleiss (9369_CR18) 1971; 76
CH Sudre (9369_CR53) 2019
9369_CR61
9369_CR28
9369_CR27
9369_CR26
9369_CR25
9369_CR24
9369_CR23
J Cohen (9369_CR14) 1968; 70
9369_CR22
9369_CR21
9369_CR65
JR Landis (9369_CR33) 1977; 33
9369_CR29
K Krippendorff (9369_CR32) 2018
R Artstein (9369_CR3) 2008; 34
9369_CR30
A Campagner (9369_CR9) 2021; 545
9369_CR39
M Lippi (9369_CR35) 2019; 27
9369_CR38
9369_CR37
9369_CR36
P Kralj Novak (9369_CR31) 2022
References_xml – reference: CohenJWeighted kappa: nominal scale agreement provision for scaled disagreement or partial creditPsychol Bull196870421310.1037/h0026256
– reference: Rottger P, Vidgen B, Hovy D et al. (2022) Two contrasting data annotation paradigms for subjective NLP tasks. In: Proceedings of the 2022 conference of the North American chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Seattle, United States, pp 175–190. https://doi.org/10.18653/v1/2022.naacl-main.13
– reference: ChinosiMTrombettaABPMN: an introduction to the standardComput Stand Interfaces201234112413410.1016/j.csi.2011.06.002
– reference: DavaniAMDíazMPrabhakaranVDealing with disagreements: looking beyond the majority vote in subjective annotationsTrans Assoc Comput Linguist2022109211010.1162/tacl_a_00449
– reference: Ramponi A, Leonardelli E (2022) DH-FBK at SemEval-2022 task 4: Leveraging annotators’ disagreement and multiple data views for patronizing language detection. In: Proceedings of the 16th international workshop on semantic evaluation (SemEval-2022). Association for Computational Linguistics, Seattle, United States, pp 324–334. https://doi.org/10.18653/v1/2022.semeval-1.42
– reference: Borchmann Ł, Wisniewski D, Gretkowski A et al. (2020) Contract discovery: Dataset and a few-shot semantic retrieval challenge with competitive baselines. In: Findings of the association for computational linguistics: EMNLP 2020. Association for Computational Linguistics, Online, pp 4254–4268. https://doi.org/10.18653/v1/2020.findings-emnlp.380
– reference: Tuggener D, von Däniken P, Peetz T et al. (2020) LEDGAR: a large-scale multi-label corpus for text classification of legal provisions in contracts. In: Proceedings of the twelfth language resources and evaluation conference. European Language Resources Association, Marseille, France, pp 1235–1241. https://aclanthology.org/2020.lrec-1.155
– reference: Zhong H, Xiao C, Tu C et al. (2020) JEC-QA: a legal-domain question answering dataset. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, no. 05, pp 9701–9708. https://doi.org/10.1609/aaai.v34i05.6519
– reference: CampagnerACiucciDSvenssonCMGround truthing from multi-rater labeling with three-way decision and possibility theoryInf Sci202154577179010.1016/j.ins.2020.09.049
– reference: Urchs S, Mitrović J, Granitzer M (2021) Design and implementation of German legal decision corpora. In: Proceedings of the 13th international conference on agents and artificial intelligence—volume 2: ICAART, INSTICC. SciTePress, pp 515–521. https://doi.org/10.5220/0010187305150521
– reference: Sachdeva P, Barreto R, Bacon G et al. (2022) The measuring hate speech corpus: leveraging Rasch measurement theory for data perspectivism. In: Proceedings of the 1st workshop on perspectivist approaches to NLP @LREC2022. European Language Resources Association, Marseille, France, pp 83–94. https://aclanthology.org/2022.nlperspectives-1.11
– reference: Beigman Klebanov B, Beigman E, Diermeier D (2008) Analyzing disagreements. In: Coling 2008: proceedings of the workshop on human judgements in computational linguistics. Coling 2008 Organizing Committee, Manchester, UK, pp 2–7. https://aclanthology.org/W08-1202
– reference: Klemen M, Robnik-Šikonja M (2022) ULFRI at SemEval-2022 task 4: leveraging uncertainty and additional knowledge for patronizing and condescending language detection. In: Proceedings of the 16th international workshop on semantic evaluation (SemEval-2022). Association for Computational Linguistics, Seattle, United States, pp 525–532. https://doi.org/10.18653/v1/2022.semeval-1.73
– reference: Akhtar S, Basile V, Patti V (2020) Modeling annotator perspective and polarized opinions to improve hate speech detection. In: Proceedings of the AAAI conference on human computation and crowdsourcing, vol 8, no 1, pp 151–154. https://doi.org/10.1609/hcomp.v8i1.7473
– reference: Xiao C, Zhong H, Sun Y (2021) Must-read papers on legal intelligence. Tech. rep., Tsinghua University. https://github.com/thunlp/LegalPapers
– reference: Waltl B (2022) Legal text analytics. Tech. rep., Liquid Legal Institute e.V. https://github.com/Liquid-Legal-Institute/Legal-Text-Analytics
– reference: Xiao C, Zhong H, Guo Z et al. (2019) CAIL2019-SCM: a dataset of similar case matching in legal domain. CoRR arxiv:1911.08962
– reference: Guha N (2021) Datasets for machine learning in law. Tech. rep., Stanford University, https://github.com/neelguha/legal-ml-datasets
– reference: Braun D, Matthes F (2022) Clause topic classification in German and English standard form contracts. In: Proceedings of the fifth workshop on e-commerce and NLP (ECNLP 5). Association for Computational Linguistics, Dublin, Ireland, pp 199–209. https://doi.org/10.18653/v1/2022.ecnlp-1.23
– reference: ZimmeckSStoryPSmullenDMaps: scaling privacy compliance analysis to a million appsProc Priv Enhanc Technol2019201966
– reference: Poudyal P, Savelka J, Ieven A et al. (2020) ECHR: legal corpus for argument mining. In: Proceedings of the 7th workshop on argument mining. Association for Computational Linguistics, Online, pp 67–75. https://aclanthology.org/2020.argmining-1.8
– reference: Lübbe-Wolff G (2022) Beratungskulturen: Wie verfassungsgerichte arbeiten, und wovon es abhängt, ob sie integrieren oder polarisieren. Tech. rep, Konrad-Adenauer-Stiftung
– reference: GebruTMorgensternJVecchioneBDatasheets for datasetsCommun ACM20216412869210.1145/3458723
– reference: SudreCHAnsonBGIngalaSShenDLiuTPetersTMLet’s agree to disagree: learning highly debatable multirater labellingMedical image computing and computer assisted intervention—MICCAI 20192019ChamSpringer665673
– reference: Tiwari A, Kalamkar P, Agarwal A et al. (2022) Must-read papers on legal intelligence. Tech. rep., OpenNyAI. https://github.com/Legal-NLP-EkStep/rhetorical-role-baseline
– reference: Gonzalez D, Zimmermann T, Nagappan N (2020) The state of the ML-universe: 10 years of artificial intelligence & machine learning software development on GitHub. In: Proceedings of the 17th international conference on mining software repositories. Association for Computing Machinery, New York, NY, USA, MSR ’20, pp 431–442. https://doi.org/10.1145/3379597.3387473
– reference: Walker VR, Strong SR, Walker VE (2020) Automating the classification of finding sentences for linguistic polarity. In: Proceedings of the fourth workshop on automated semantic analysis of information in legal text
– reference: Hendrycks D, Burns C, Chen A et al. (2021) CUAD: an expert-annotated NLP dataset for legal contract review. CoRR arxiv:2103.06268
– reference: Keymanesh M, Elsner M, Sarthasarathy S (2020) Toward domain-guided controllable summarization of privacy policies. In: NLLP@ KDD, pp 18–24
– reference: Manor L, Li JJ (2019) Plain English summarization of contracts. In: Proceedings of the natural legal language processing workshop 2019. Association for Computational Linguistics, Minneapolis, Minnesota, pp 1–11. https://doi.org/10.18653/v1/W19-2201, https://aclanthology.org/W19-2201
– reference: Wilson S, Schaub F, Dara AA et al. (2016) The creation and analysis of a website privacy policy corpus. In: Proceedings of the 54th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, Germany, pp 1330–1340. https://doi.org/10.18653/v1/P16-1126, https://aclanthology.org/P16-1126
– reference: Glaser I, Scepankova E, Matthes F (2018) Classifying semantic types of legal sentences: portability of machine learning models. In: Legal knowledge and information systems. IOS Press, pp 61–70
– reference: Schwarzer M (2022) awesome-legal-data. Tech. rep., Open Justice e.V. https://github.com/openlegaldata/awesome-legal-data
– reference: Prabhakaran V, Mostafazadeh Davani A, Diaz M (2021) On releasing annotator-level labels and information in datasets. In: Proceedings of the Joint 15th linguistic annotation workshop (LAW) and 3rd designing meaning representations (DMR) workshop. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 133–138. https://doi.org/10.18653/v1/2021.law-1.14
– reference: Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174. https://doi.org/10.2307/2529310
– reference: Chalkidis I, Jana A, Hartung D et al. (2022) LexGLUE: a benchmark dataset for legal language understanding in English. In: Proceedings of the 60th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Dublin, Ireland, pp 4310–4330. https://doi.org/10.18653/v1/2022.acl-long.297
– reference: Kralj Novak P, Scantamburlo T, Pelicon A (2022) Handling disagreement in hate speech modelling. In: Ciucci D, Couso I, Medina J (eds) Information processing and management of uncertainty in knowledge-based systems. Springer, Cham, pp 681–695. https://doi.org/10.1007/978-3-031-08974-9_54
– reference: Artstein R, Poesio M (2008) Inter-coder agreement for computational linguistics. Comput Linguist 34(4):555–596. https://doi.org/10.1162/coli.07-034-R2
– reference: Holland S, Hosny A, Newman S et al. (2020) The dataset nutrition label. Data protection and privacy, volume 12: data protection and democracy 12:1
– reference: Roegiest A, Hudek AK, McNulty A (2018) A dataset and an examination of identifying passages for due diligence. In: The 41st international ACM SIGIR conference on research & development in information retrieval. Association for Computing Machinery, New York, NY, USA, SIGIR ’18, pp 465–474. https://doi.org/10.1145/3209978.3210015
– reference: Artstein R (2017) Inter-annotator agreement. Springer, Dordrecht, pp 297–313. https://doi.org/10.1007/978-94-024-0881-2_11
– reference: Ostendorff M, Blume T, Ostendorff S (2020) Towards an open platform for legal information. In: Proceedings of the ACM/IEEE joint conference on digital libraries in 2020. Association for Computing Machinery, New York, NY, USA, JCDL ’20, pp 385–388. https://doi.org/10.1145/3383583.3398616
– reference: Habernal I, Faber D, Recchia N et al. (2022) Mining legal arguments in court decisions. arXiv preprint https://doi.org/10.48550/arXiv.2208.06178
– reference: Louis A, Spanakis G (2022) A statutory article retrieval dataset in French. In: Proceedings of the 60th annual meeting of the Association for Computational Linguistics (volume 1: long papers). Association for Computational Linguistics, Dublin, Ireland, pp 6789–6803. https://doi.org/10.18653/v1/2022.acl-long.468
– reference: Wyner A, Peters W, Katz D (2013) A case study on legal case annotation. In: Legal knowledge and information systems. IOS Press, pp 165–174
– reference: Savelka J, Xu H, Ashley KD (2019) Improving sentence retrieval from case law for statutory interpretation. In: Proceedings of the seventeenth international conference on artificial intelligence and law. Association for Computing Machinery, New York, NY, USA, ICAIL ’19, pp 113–122. https://doi.org/10.1145/3322640.3326736
– reference: Wu Y, Wang N, Kropczynski J (2017) The appropriation of GitHub for curation. PeerJ Comput Sci 3. https://doi.org/10.7717/peerj-cs.134
– reference: Locke D, Zuccon G (2018) A test collection for evaluating legal case law search. In: The 41st international ACM SIGIR conference on research & development in information retrieval. Association for Computing Machinery, New York, NY, USA, SIGIR ’18, pp 1261–1264. https://doi.org/10.1145/3209978.3210161
– reference: Zahidi Y, El Younoussi Y, Azroumahli C (2019) Comparative study of the most useful Arabic-supporting natural language processing and deep learning libraries. In: 2019 5th international conference on optimization and applications (ICOA), pp 1–10. https://doi.org/10.1109/ICOA.2019.8727617
– reference: Steinberger R, Pouliquen B, Widiger A et al. (2006) The JRC-Acquis: a multilingual aligned parallel corpus with 20+ languages. In: Proceedings of the fifth international conference on language resources and evaluation (LREC’06). European Language Resources Association (ELRA), Genoa, Italy. http://www.lrec-conf.org/proceedings/lrec2006/pdf/340_pdf.pdf
– reference: Braun D, Matthes F (2021) NLP for consumer protection: battling illegal clauses in German terms and conditions in online shopping. In: Proceedings of the 1st workshop on NLP for positive impact. Association for Computational Linguistics, Online, pp 93–99. https://doi.org/10.18653/v1/2021.nlp4posimpact-1.10
– reference: Chalkidis I, Androutsopoulos I, Michos A (2017) Extracting contract elements. In: Proceedings of the 16th edition of the international conference on artificial intelligence and law. Association for Computing Machinery, New York, NY, USA, ICAIL ’17, pp 19–28. https://doi.org/10.1145/3086512.3086515
– reference: Krippendorff K (2018) Content analysis: an introduction to its methodology, 4th edn. Sage Publications, Thousand Oaks
– reference: Grover C, Hachey B, Hughson I (2004) The HOLJ corpus. Supporting summarisation of legal texts. In: Proceedings of the 5th international workshop on linguistically interpreted Corpora. COLING, Geneva, Switzerland, pp 47–54. https://aclanthology.org/W04-1907
– reference: Kalamkar P, Tiwari A, Agarwal A et al. (2022) Corpus for automatic structuring of legal documents. CoRR arxiv:2201.13125
– reference: Šavelka J, Ashley KD (2018) Segmenting US court decisions into functional and issue specific parts. In: Legal knowledge and information systems. IOS Press, pp 111–120
– reference: Basile V, Cabitza F, Campagner A et al. (2021) Toward a perspectivist turn in ground truthing for predictive computing. CoRR arxiv:2109.04270
– reference: Jamison E, Gurevych I (2015) Noise or additional information? leveraging crowdsource annotation item agreement for natural language tasks. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Association for Computational Linguistics, Lisbon, Portugal, pp 291–297. https://doi.org/10.18653/v1/D15-1035
– reference: Li S (2017) A corpus-based study of vague language in legislative texts: strategic use of vague terms. Engl Specif Purp 45:98–109. https://doi.org/10.1016/j.esp.2016.10.001
– reference: Duan X, Wang B, Wang Z (2019) CJRC: a reliable human-annotated benchmark dataset for Chinese judicial reading comprehension. In: Sun M, Huang X, Ji H (eds) Chinese computational linguistics. Springer, Cham, pp 439–451. https://doi.org/10.1007/978-3-030-32381-3_36
– reference: Chan B, Schweter S, Möller T (2020) German’s next language model. In: Proceedings of the 28th international conference on computational linguistics. International Committee on Computational Linguistics, Barcelona, Spain (Online), pp 6788–6796. https://doi.org/10.18653/v1/2020.coling-main.598
– reference: Fleiss JL (1971) Measuring nominal scale agreement among many raters. Psychol Bull 76(5):378. https://doi.org/10.1037/h0031619
– reference: Sas C, Capiluppi A (2022) Antipatterns in software classification taxonomies. J Syst Softw 190:111343. https://doi.org/10.1016/j.jss.2022.111343
– reference: Ovesdotter Alm C (2011) Subjective natural language problems: motivations, applications, characterizations, and implications. In: Proceedings of the 49th annual meeting of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, Portland, Oregon, USA, pp 107–112. https://aclanthology.org/P11-2019
– reference: Lippi M, Pałka P, Contissa G (2019) Claudette: an automated detector of potentially unfair clauses in online terms of service. Artif Intell Law 27(2):117–139. https://doi.org/10.1007/s10506-019-09243-2
– reference: Drawzeski K, Galassi A, Jablonowska A et al. (2021) A corpus for multilingual analysis of online terms of service. In: Proceedings of the natural legal language processing workshop 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 1–8. https://doi.org/10.18653/v1/2021.nllp-1.1
StartPage 839
SubjectTerms Annotations
Annotations and citations (Law)
Artificial Intelligence
Computer Science
Data analysis
Datasets
Documents
Evaluation
Information Storage and Retrieval
Intellectual Property
IT Law
Legal Aspects of Computing
Legislation
Machine learning
Media Law
Original Research
Peer review
Philosophy of Law
Title I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets
URI https://link.springer.com/article/10.1007/s10506-023-09369-4
https://www.proquest.com/docview/3086492163
Volume 32