Exploring diversity through machine learning: a case for the use of decision trees in social science research

Bibliographic Details
Published in: International journal of social research methodology, Vol. 25, no. 6, pp. 725-740
Main Authors: Srour, F. Jordan; Karkoulian, Silva
Format: Journal Article
Language: English
Published: Abingdon: Routledge, 02.11.2022; Taylor & Francis Ltd
Summary: The literature provides multiple measures of diversity along a single demographic dimension, but when it comes to studying the interaction of multiple diversity types (e.g. age, gender, and race), the field of useable measures diminishes. We present the use of decision trees as a machine learning technique to automatically identify the interactions across diversity types to predict different levels of a dependent variable. In order to demonstrate the power of decision trees, we use five types of surface-level diversity (age, gender, education level, religion, and region of origin) measured via the standardized Blau index as independent variables and knowledge sharing as the dependent variable. The results of our decision tree approach relative to linear regression show that decision trees serve as a powerful tool to identify key demographic faultlines without a priori specification of a model structure.
ISSN: 1364-5579, 1464-5300
DOI: 10.1080/13645579.2021.1933064
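
Illustrative note: the summary names the standardized Blau index as the measure for each diversity type. The sketch below shows one common way to compute it, assuming the usual definition (one minus the sum of squared category proportions, divided by its maximum of (k - 1) / k for k possible categories). The function names and the sample teams are hypothetical and are not taken from the article, which may normalize differently.

```python
from collections import Counter


def blau_index(categories):
    """Blau index: 1 minus the sum of squared category proportions."""
    n = len(categories)
    if n == 0:
        raise ValueError("need at least one group member")
    counts = Counter(categories)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def standardized_blau(categories, num_possible):
    """Blau index scaled to [0, 1] by its maximum, (k - 1) / k.

    num_possible is k, the number of categories the attribute can take
    (an assumption the analyst must supply; it is not inferred from data).
    """
    if num_possible < 2:
        raise ValueError("need at least two possible categories")
    return blau_index(categories) / ((num_possible - 1) / num_possible)


# Hypothetical teams, for illustration only.
team_gender = ["F", "F", "M", "M"]                   # evenly split, 2 categories
team_region = ["north", "north", "north", "south"]   # 4 possible regions assumed

print(standardized_blau(team_gender, 2))
print(standardized_blau(team_region, 4))
```

One such score per diversity type and per group would then form the feature columns passed to a decision tree or a linear regression, with knowledge sharing as the target; that modeling step is not shown here.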