CheXclusion: Fairness gaps in deep chest X-ray classifiers

Bibliographic Details
Published in Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Vol. 26; p. 232
Main Authors Seyyed-Kalantari, Laleh; Liu, Guanxiong; McDermott, Matthew; Chen, Irene Y; Ghassemi, Marzyeh
Format Journal Article
Language English
Published United States 2021
Subjects
ISSN 2335-6936

Abstract Machine learning systems have received much attention recently for their ability to achieve expert-level performance on clinical tasks, particularly in medical imaging. Here, we examine the extent to which state-of-the-art deep learning classifiers trained to yield diagnostic labels from X-ray images are biased with respect to protected attributes. We train convolutional neural networks to predict 14 diagnostic labels in 3 prominent public chest X-ray datasets (MIMIC-CXR, Chest-Xray8, and CheXpert), as well as a multi-site aggregation of all those datasets. We evaluate the TPR disparity, i.e., the difference in true positive rates (TPR), across protected attributes such as patient sex, age, race, and insurance type (as a proxy for socioeconomic status). We demonstrate that TPR disparities exist in the state-of-the-art classifiers in all datasets, for all clinical tasks, and all subgroups. A multi-source dataset corresponds to the smallest disparities, suggesting one way to reduce bias. We find that TPR disparities are not significantly correlated with a subgroup's proportional disease burden. As clinical models move from papers to products, we encourage clinical decision makers to carefully audit for algorithmic disparities prior to deployment. Our supplementary materials can be found at http://www.marzyehghassemi.com/chexclusion-supp-3/.
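The TPR disparity described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy data, and the choice of the median subgroup TPR as the reference point are all assumptions made for illustration. The abstract itself only states that the disparity is a difference in true positive rates.

```python
from statistics import median


def tpr_by_group(y_true, y_pred, groups):
    """Return {group: TPR} for one binary label.

    Hypothetical helper: groups with no positive cases are skipped,
    because their TPR is undefined.
    """
    positives = {}
    true_positives = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] = positives.get(group, 0) + 1
            true_positives[group] = true_positives.get(group, 0) + (1 if pred == 1 else 0)
    return {g: true_positives[g] / positives[g] for g in positives}


def tpr_disparity(y_true, y_pred, groups):
    """Return {group: TPR minus the median TPR over groups}.

    Assumption: the median over subgroups is used as the reference;
    other references (e.g. a fixed baseline group) are equally possible.
    """
    rates = tpr_by_group(y_true, y_pred, groups)
    reference = median(rates.values())
    return {g: rate - reference for g, rate in rates.items()}


# Toy data, invented for this example only.
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
sex = ["F", "F", "F", "M", "M", "M", "M", "F"]

rates = tpr_by_group(y_true, y_pred, sex)
gaps = tpr_disparity(y_true, y_pred, sex)
print(rates)
print(gaps)
```

A negative gap means the classifier misses positive cases in that subgroup more often than in the reference, which is the kind of under-diagnosis pattern a pre-deployment audit would look for.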
Author_xml – sequence: 1
  givenname: Laleh
  surname: Seyyed-Kalantari
  fullname: Seyyed-Kalantari, Laleh
  email: laleh@cs.toronto.edu
  organization: Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute, Toronto, Ontario, Canada. Corresponding author, laleh@cs.toronto.edu
– sequence: 2
  givenname: Guanxiong
  surname: Liu
  fullname: Liu, Guanxiong
– sequence: 3
  givenname: Matthew
  surname: McDermott
  fullname: McDermott, Matthew
– sequence: 4
  givenname: Irene Y
  surname: Chen
  fullname: Chen, Irene Y
– sequence: 5
  givenname: Marzyeh
  surname: Ghassemi
  fullname: Ghassemi, Marzyeh
BackLink https://www.ncbi.nlm.nih.gov/pubmed/33691020 (View this record in MEDLINE/PubMed)
EISSN 2335-6936
ExternalDocumentID 33691020
Genre Research Support, Non-U.S. Gov't
Journal Article
IsPeerReviewed true
IsScholarly true
Language English
PMID 33691020
PublicationTitleAlternate Pac Symp Biocomput
SubjectTerms Computational Biology
Humans
Machine Learning
Neural Networks, Computer
X-Rays
Title CheXclusion: Fairness gaps in deep chest X-ray classifiers
URI https://www.ncbi.nlm.nih.gov/pubmed/33691020
Volume 26