Hate Speech Classifiers Learn Normative Social Stereotypes

Bibliographic Details
Published in: Transactions of the Association for Computational Linguistics, Vol. 11, pp. 300–319
Main Authors: Davani, Aida Mostafazadeh; Atari, Mohammad; Kennedy, Brendan; Dehghani, Morteza
Format: Journal Article
Language: English
Published: MIT Press, One Broadway, 12th Floor, Cambridge, Massachusetts 02142, USA, 22 March 2023
Subjects: Annotations; Classifiers; Computational linguistics; English language; Hate speech; Machine learning; Social exclusion; Social factors; Stereotypes
Online Access: Full text via MIT Press (https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00550) or DOAJ (https://doaj.org/article/12743c9b2097440b81b4f0812111a02c)
ISSN: 2307-387X
DOI: 10.1162/tacl_a_00550

Abstract: Social stereotypes negatively impact individuals’ judgments about different groups and may have a critical role in understanding language directed toward marginalized groups. Here, we assess the role of social stereotypes in the automated detection of hate speech in the English language by examining the impact of social stereotypes on annotation behaviors, annotated datasets, and hate speech classifiers. Specifically, we first investigate the impact of novice annotators’ stereotypes on their hate-speech-annotation behavior. Then, we examine the effect of normative stereotypes in language on the aggregated annotators’ judgments in a large annotated corpus. Finally, we demonstrate how normative stereotypes embedded in language resources are associated with systematic prediction errors in a hate-speech classifier. The results demonstrate that hate-speech classifiers reflect social stereotypes against marginalized groups, which can perpetuate social inequalities when propagated at scale. This framework, combining social-psychological and computational-linguistic methods, provides insights into sources of bias in hate-speech moderation, informing ongoing debates regarding machine learning fairness.
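The abstract's final step, linking normative stereotypes in language resources to systematic prediction errors, can be illustrated with a minimal audit sketch: train a simple classifier and compare false-positive rates on non-hateful posts that mention different social groups. This is a hypothetical illustration, not the authors' pipeline; the file name and the column names (text, label, group_mentioned) are assumptions made for the example.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Assumed input: one row per post, with a binary hate label (1 = hateful)
# and the social group mentioned in the post (column names are hypothetical).
df = pd.read_csv("annotated_posts.csv")

train, test = train_test_split(df, test_size=0.2, random_state=0, stratify=df["label"])

# A simple bag-of-words SVM stands in for a hate-speech classifier.
clf = make_pipeline(TfidfVectorizer(min_df=2), LinearSVC())
clf.fit(train["text"], train["label"])
test = test.assign(pred=clf.predict(test["text"]))

# False-positive rate among non-hateful posts, broken down by group mentioned:
# systematically higher rates for some groups would indicate the kind of
# stereotype-linked prediction error the abstract describes.
fpr_by_group = test[test["label"] == 0].groupby("group_mentioned")["pred"].mean()
print(fpr_by_group.sort_values(ascending=False))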
Authors and Affiliations:
– Davani, Aida Mostafazadeh: University of Southern California, USA (mostafaz@usc.edu)
– Atari, Mohammad: University of Southern California, USA (atari@usc.edu)
– Kennedy, Brendan: University of Southern California, USA (btkenned@usc.edu)
– Dehghani, Morteza: University of Southern California, USA (mdehghan@usc.edu)
Copyright: 2023. This work is published under https://creativecommons.org/licenses/by/4.0/legalcode (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.