Hate Speech Classifiers Learn Normative Social Stereotypes

Bibliographic Details
Published in: Transactions of the Association for Computational Linguistics, Vol. 11, pp. 300–319
Main Authors: Davani, Aida Mostafazadeh; Atari, Mohammad; Kennedy, Brendan; Dehghani, Morteza
Format: Journal Article
Language: English
Published: MIT Press, One Broadway, 12th Floor, Cambridge, Massachusetts 02142, USA, 22 March 2023
Subjects: Annotations; Classifiers; Computational linguistics; English language; Hate speech; Machine learning; Social exclusion; Social factors; Stereotypes
Online Access: Full text via MIT Press (https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00550) or DOAJ (https://doaj.org/article/12743c9b2097440b81b4f0812111a02c)
ISSN: 2307-387X
DOI: 10.1162/tacl_a_00550

Abstract: Social stereotypes negatively impact individuals’ judgments about different groups and may have a critical role in understanding language directed toward marginalized groups. Here, we assess the role of social stereotypes in the automated detection of hate speech in the English language by examining the impact of social stereotypes on annotation behaviors, annotated datasets, and hate speech classifiers. Specifically, we first investigate the impact of novice annotators’ stereotypes on their hate-speech-annotation behavior. Then, we examine the effect of normative stereotypes in language on the aggregated annotators’ judgments in a large annotated corpus. Finally, we demonstrate how normative stereotypes embedded in language resources are associated with systematic prediction errors in a hate-speech classifier. The results demonstrate that hate-speech classifiers reflect social stereotypes against marginalized groups, which can perpetuate social inequalities when propagated at scale. This framework, combining social-psychological and computational-linguistic methods, provides insights into sources of bias in hate-speech moderation, informing ongoing debates regarding machine learning fairness.
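The abstract's final step, linking normative stereotypes in language resources to systematic prediction errors, can be illustrated with a minimal audit sketch: train a simple classifier and compare false-positive rates on non-hateful posts that mention different social groups. This is a hypothetical illustration, not the authors' pipeline; the file name and the column names (text, label, group_mentioned) are assumptions made for the example.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Assumed input: one row per post, with a binary hate label (1 = hateful)
# and the social group mentioned in the post (column names are hypothetical).
df = pd.read_csv("annotated_posts.csv")

train, test = train_test_split(df, test_size=0.2, random_state=0, stratify=df["label"])

# A simple bag-of-words SVM stands in for a hate-speech classifier.
clf = make_pipeline(TfidfVectorizer(min_df=2), LinearSVC())
clf.fit(train["text"], train["label"])
test = test.assign(pred=clf.predict(test["text"]))

# False-positive rate among non-hateful posts, broken down by group mentioned:
# systematically higher rates for some groups would indicate the kind of
# stereotype-linked prediction error the abstract describes.
fpr_by_group = test[test["label"] == 0].groupby("group_mentioned")["pred"].mean()
print(fpr_by_group.sort_values(ascending=False))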
Authors and Affiliations:
– Davani, Aida Mostafazadeh: University of Southern California, USA (mostafaz@usc.edu)
– Atari, Mohammad: University of Southern California, USA (atari@usc.edu)
– Kennedy, Brendan: University of Southern California, USA (btkenned@usc.edu)
– Dehghani, Morteza: University of Southern California, USA (mdehghan@usc.edu)
Copyright: 2023. This work is published under https://creativecommons.org/licenses/by/4.0/legalcode (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.