Artificial Intelligence and Inclusion: Formerly Gang-Involved Youth as Domain Experts for Analyzing Unstructured Twitter Data


Bibliographic Details
Published in Social Science Computer Review Vol. 38; no. 1; pp. 42-56
Main Authors Frey, William R., Patton, Desmond U., Gaskell, Michael B., McGregor, Kyle A.
Format Journal Article
Language English
Published Los Angeles, CA SAGE Publications 01.02.2020
SAGE PUBLICATIONS, INC
Summary: Mining social media data for studying the human condition has created new and unique challenges. When analyzing social media data from marginalized communities, algorithms lack the ability to accurately interpret off-line context, which may lead to dangerous assumptions about and implications for marginalized communities. To combat this challenge, we hired formerly gang-involved young people as domain experts for contextualizing social media data in order to create inclusive, community-informed algorithms. Utilizing data from the Gang Intervention and Computer Science Project (a comprehensive analysis of Twitter data from gang-involved youth in Chicago), we describe the process of involving formerly gang-involved young people in developing a new part-of-speech tagger and content classifier for a prototype natural language processing system that detects aggression and loss in Twitter data. We argue that involving young people as domain experts leads to more robust understandings of context, including localized language, culture, and events. These insights could change how data scientists approach the development of corpora and algorithms that affect people in marginalized communities and who to involve in that process. We offer a contextually driven interdisciplinary approach between social work and data science that integrates domain insights into the training of qualitative annotators and the production of algorithms for positive social impact.
Authors’ Note
We would like to express our gratitude to our domain experts for providing their insights and sharing their experiences. We would also like to thank Eddie Bocanegra, Meg Helder, Owen Rambow, and Kathleen McKeown for their support and input on this study, and the anonymous reviewers for their feedback. All tweets shared in this article (other than tweets from Gakirah Barnes) have been deidentified and made unsearchable. Due to the nature of the data and the vulnerable and marginalized population from which the data originate, data will only be made available through an application process and the signing of a memorandum of understanding (MOU) through the SAFElab at Columbia University (dp2787@columbia.edu).
ISSN: 0894-4393, 1552-8286
DOI: 10.1177/0894439318788314