Collaborative annotation for reliable natural language processing : technical and sociological aspects


Bibliographic Details
Main Author Fort, Karën
Format eBook
Book
Language English
Published London ISTE 2016
Hoboken, N.J J. Wiley & Sons
John Wiley & Sons, Incorporated
Wiley-Blackwell
Wiley-ISTE
Edition 1
Subjects
Online Access Get full text

Abstract This book presents a unique opportunity for constructing a consistent image of collaborative manual annotation for Natural Language Processing (NLP). NLP has witnessed two major evolutions in the past 25 years: firstly, the extraordinary success of machine learning, which is now, for better or for worse, overwhelmingly dominant in the field, and secondly, the multiplication of evaluation campaigns or shared tasks. Both involve manually annotated corpora, for the training and evaluation of the systems. These corpora have progressively become the hidden pillars of our domain, providing food for our hungry machine learning algorithms and reference for evaluation. Annotation is now the place where linguistics hides in NLP. However, manual annotation has largely been ignored for some time, and it has taken a while even for annotation guidelines to be recognized as essential. Although some efforts have been made lately to address some of the issues presented by manual annotation, there has still been little research done on the subject. This book aims to provide some useful insights into the subject. Manual corpus annotation is now at the heart of NLP, and is still largely unexplored. There is a need for manual annotation engineering (in the sense of a precisely formalized process), and this book aims to provide a first step towards a holistic methodology, with a global view on annotation.
Author Fort, Karën
BackLink https://cir.nii.ac.jp/crid/1130282273284077312 (View record in CiNii)
https://hal.science/hal-01324322 (View record in HAL)
ContentType eBook
Book
Copyright Distributed under a Creative Commons Attribution 4.0 International License
DEWEY 006.3/5
DOI 10.1002/9781119306696
DatabaseName CiNii Complete
Hyper Article en Ligne (HAL)
Hyper Article en Ligne (HAL) (Open Access)


DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 1119307643
9781119307648
9781119307655
1119307651
Edition 1
1st edition.
ExternalDocumentID oai_HAL_hal_01324322v1
9781119307655
9781119307648
EBC4558125
BB22489608
ISBN 1848219040
9781848219045
IngestDate Fri May 09 12:23:57 EDT 2025
Mon Feb 10 07:36:07 EST 2025
Fri Nov 08 06:04:43 EST 2024
Wed Aug 27 04:37:44 EDT 2025
Thu Jun 26 23:36:36 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Keywords annotation
inter-annotator agreement
crowdsourcing
ethics
LCCN 2016936602
LCCallNum_Ident QA76.9.N38 .F678 2016
Language English
License Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0
LinkModel OpenURL
Notes Bibliography: p. [143]-162
Includes index
OCLC 951809856
ORCID 0000-0002-0723-8850
OpenAccessLink https://hal.science/hal-01324322
PQID EBC4558125
PageCount 196
ParticipantIDs hal_primary_oai_HAL_hal_01324322v1
askewsholts_vlebooks_9781119307655
askewsholts_vlebooks_9781119307648
proquest_ebookcentral_EBC4558125
nii_cinii_1130282273284077312
PublicationCentury 2000
PublicationDate 2016
2016-06-14
2016-07-01
PublicationDecade 2010
PublicationPlace London
Hoboken, N.J
Newark
PublicationYear 2016
Publisher ISTE
J. Wiley & Sons
John Wiley & Sons, Incorporated
Wiley-Blackwell
Wiley-ISTE
SourceID hal
askewsholts
proquest
nii
SourceType Open Access Repository
Aggregation Database
Publisher
SubjectTerms Computer Science
Document and Text Processing
Natural language processing (Computer science)
TableOfContents Cover -- Title Page -- Copyright -- Contents -- Preface -- List of Acronyms -- Introduction -- I.1. Natural Language Processing and manual annotation: Dr Jekyll and Mr Hyde? -- I.1.1. Where linguistics hides -- I.1.2. What is annotation? -- I.1.3. New forms, old issues -- I.2. Rediscovering annotation -- I.2.1. A rise in diversity and complexity -- I.2.2. Redefining manual annotation costs -- 1: Annotating Collaboratively -- 1.1. The annotation process (re)visited -- 1.1.1. Building consensus -- 1.1.2. Existing methodologies -- 1.1.3. Preparatory work -- 1.1.3.1. Identifying the actors -- 1.1.3.2. Taking the corpus into account -- 1.1.3.3. Creating and modifying the annotation guide -- 1.1.4. Pre-campaign -- 1.1.4.1. Building the mini-reference -- 1.1.4.2. Training the annotators -- 1.1.5. Annotation -- 1.1.5.1. Breaking-in -- 1.1.5.2. Annotating -- 1.1.5.3. Updating -- 1.1.6. Finalization -- 1.1.6.1. Failure -- 1.1.6.2. Adjudication -- 1.1.6.3. Reviewing -- 1.1.6.4. Publication -- 1.2. Annotation complexity -- 1.2.1. Example overview -- 1.2.1.1. Example 1: POS -- 1.2.1.2. Example 2: gene renaming -- 1.2.1.3. Example 3: structured named entities -- 1.2.2. What to annotate? -- 1.2.2.1. Discrimination -- 1.2.2.2. Delimitation -- 1.2.3. How to annotate? -- 1.2.3.1. Expressiveness of the annotation language -- 1.2.3.2. Tagset dimension -- 1.2.3.3. Degree of ambiguity -- 1.2.3.3.1. Residual ambiguity -- 1.2.3.3.2. Theoretical ambiguity -- 1.2.4. The weight of the context -- 1.2.5. Visualization -- 1.2.6. Elementary annotation tasks -- 1.2.6.1. Identifying gene names -- 1.2.6.2. Annotating gene renaming relations -- 1.3. Annotation tools -- 1.3.1. To be or not to be an annotation tool -- 1.3.2. Much more than prototypes -- 1.3.2.1. Taking the annotators into account -- 1.3.2.2. Standardizing the formalisms
1.3.3. Addressing the new annotation challenges -- 1.3.3.1. Towards more flexible and more generic tools -- 1.3.3.2. Towards more collaborative annotation -- 1.3.3.3. Towards the annotation campaign management -- 1.3.4. The impossible dream tool -- 1.4. Evaluating the annotation quality -- 1.4.1. What is annotation quality? -- 1.4.2. Understanding the basics -- 1.4.2.1. How lucky can you get? -- 1.4.2.2. The kappa family -- 1.4.2.2.1. Scott's pi -- 1.4.2.2.2. Cohen's kappa -- 1.4.2.3. The dark side of kappas -- 1.4.2.4. The F-measure: proceed with caution -- 1.4.3. Beyond kappas -- 1.4.3.1. Weighted coefficients -- 1.4.3.2. γ: the (nearly) universal metrics -- 1.4.4. Giving meaning to the metrics -- 1.4.4.1. The Corpus Shuffling Tool -- 1.4.4.2. Experimental results -- 1.4.4.2.1. Artificial annotations -- 1.4.4.2.2. Annotations from a real corpus -- 1.5. Conclusion -- 2: Crowdsourcing Annotation -- 2.1. What is crowdsourcing and why should we be interested in it? -- 2.1.1. A moving target -- 2.1.2. A massive success -- 2.2. Deconstructing the myths -- 2.2.1. Crowdsourcing is a recent phenomenon -- 2.2.2. Crowdsourcing involves a crowd (of non-experts) -- 2.2.3. "Crowdsourcing involves (a crowd of) non-experts" -- 2.3. Playing with a purpose -- 2.3.1. Using the players' innate capabilities and world knowledge -- 2.3.2. Using the players' school knowledge -- 2.3.3. Using the players' learning capacities -- 2.4. Acknowledging crowdsourcing specifics -- 2.4.1. Motivating the participants -- 2.4.2. Producing quality data -- 2.5. Ethical issues -- 2.5.1. Game ethics -- 2.5.2. What's wrong with Amazon Mechanical Turk? -- 2.5.3. A charter to rule them all -- Conclusion -- Appendix: (Some) Annotation Tools -- A.1. Generic tools -- A.1.1. Cadixe -- A.1.2. Callisto -- A.1.3. Amazon Mechanical Turk -- A.1.4. Knowtator -- A.1.5. MMAX2 -- A.1.6. UAM CorpusTool
A.1.7. Glozz -- A.1.8. CCASH -- A.1.9. brat -- A.2. Task-oriented tools -- A.2.1. LDC tools -- A.2.2. EasyRef -- A.2.3. Phrase Detectives -- A.2.4. ZombiLingo -- A.3. NLP annotation platforms -- A.3.1. GATE -- A.3.2. EULIA -- A.3.3. UIMA -- A.3.4. SYNC3 -- A.4. Annotation management tools -- A.4.1. Slate -- A.4.2. Djangology -- A.4.3. GATE Teamware -- A.4.4. WebAnno -- A.5. (Many) Other tools -- Glossary -- Bibliography -- Index -- Other titles from ISTE in Cognitive Science and Knowledge Management -- ELUA
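The table of contents above covers inter-annotator agreement coefficients, including Cohen's kappa (section 1.4.2.2.2). As a rough illustration only (not taken from the book), the two-annotator computation over hypothetical part-of-speech labels can be sketched as:

```python
from collections import Counter

def cohen_kappa(ann1, ann2):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    if len(ann1) != len(ann2) or not ann1:
        raise ValueError("annotations must be non-empty and aligned")
    n = len(ann1)
    # Observed agreement: share of items both annotators labeled identically.
    p_o = sum(a == b for a, b in zip(ann1, ann2)) / n
    # Expected chance agreement: product of each annotator's label marginals.
    c1, c2 = Counter(ann1), Counter(ann2)
    p_e = sum(c1[lab] * c2[lab] for lab in c1) / (n * n)
    if p_e == 1.0:  # degenerate case: both annotators always use one label
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Hypothetical POS annotations on four tokens by two annotators:
# p_o = 0.75, p_e = 0.5, so kappa = (0.75 - 0.5) / (1 - 0.5) = 0.5
k = cohen_kappa(["N", "V", "N", "N"], ["N", "V", "V", "N"])
```

Unlike raw percent agreement, kappa discounts the agreement two annotators would reach by chance given their label distributions, which is why the book treats it (and Scott's pi) as a baseline quality measure for annotation campaigns.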
Title Collaborative annotation for reliable natural language processing : technical and sociological aspects
URI https://cir.nii.ac.jp/crid/1130282273284077312
https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=4558125
https://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9781119307648&uid=none
https://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9781119307655
https://hal.science/hal-01324322
linkProvider ProQuest Ebooks