CheXclusion: Fairness gaps in deep chest X-ray classifiers

Bibliographic Details
Published in	Pacific Symposium on Biocomputing, Vol. 26; p. 232
Main Authors	Seyyed-Kalantari, Laleh; Liu, Guanxiong; McDermott, Matthew; Chen, Irene Y; Ghassemi, Marzyeh
Format	Journal Article
Language	English
Published	United States 2021
Subjects	Computational Biology; Humans; Machine Learning; Neural Networks, Computer; X-Rays
ISSN	2335-6936

Abstract	Machine learning systems have received much attention recently for their ability to achieve expert-level performance on clinical tasks, particularly in medical imaging. Here, we examine the extent to which state-of-the-art deep learning classifiers trained to yield diagnostic labels from X-ray images are biased with respect to protected attributes. We train convolutional neural networks to predict 14 diagnostic labels in 3 prominent public chest X-ray datasets: MIMIC-CXR, Chest-Xray8, and CheXpert, as well as a multi-site aggregation of all those datasets. We evaluate the TPR disparity - the difference in true positive rates (TPR) - among different protected attributes such as patient sex, age, race, and insurance type as a proxy for socioeconomic status. We demonstrate that TPR disparities exist in the state-of-the-art classifiers in all datasets, for all clinical tasks, and all subgroups. A multi-source dataset corresponds to the smallest disparities, suggesting one way to reduce bias. We find that TPR disparities are not significantly correlated with a subgroup's proportional disease burden. As clinical models move from papers to products, we encourage clinical decision makers to carefully audit for algorithmic disparities prior to deployment. Our supplementary materials can be found at http://www.marzyehghassemi.com/chexclusion-supp-3/.
Author	McDermott, Matthew; Ghassemi, Marzyeh; Seyyed-Kalantari, Laleh; Chen, Irene Y; Liu, Guanxiong
Author_xml	– sequence: 1 givenname: Laleh surname: Seyyed-Kalantari fullname: Seyyed-Kalantari, Laleh email: laleh@cs.toronto.edu organization: Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute, Toronto, Ontario, Canada. Corresponding author, laleh@cs.toronto.edu – sequence: 2 givenname: Guanxiong surname: Liu fullname: Liu, Guanxiong – sequence: 3 givenname: Matthew surname: McDermott fullname: McDermott, Matthew – sequence: 4 givenname: Irene Y surname: Chen fullname: Chen, Irene Y – sequence: 5 givenname: Marzyeh surname: Ghassemi fullname: Ghassemi, Marzyeh
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/33691020$$D View this record in MEDLINE/PubMed
ContentType	Journal Article
DBID	CGR CUY CVF ECM EIF NPM
DatabaseName	Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed
DatabaseTitle	MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid)
DatabaseTitleList	MEDLINE
Database_xml	– sequence: 1 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: EIF name: MEDLINE url: https://proxy.k.utb.cz/login?url=https://www.webofscience.com/wos/medline/basic-search sourceTypes: Index Database
DeliveryMethod	no_fulltext_linktorsrc
EISSN	2335-6936
ExternalDocumentID	33691020
Genre	Research Support, Non-U.S. Gov't Journal Article
GroupedDBID	CGR CUY CVF ECM EIF NPM
ID	FETCH-LOGICAL-p211t-95bb731e3a072ce8772ce0dc4fde4a0f1158939063c858cd1cfd3cbf1de2c6752
IngestDate	Thu Nov 24 21:46:41 EST 2022
IsPeerReviewed	true
IsScholarly	true
Language	English
LinkModel	OpenURL
MergedId	FETCHMERGED-LOGICAL-p211t-95bb731e3a072ce8772ce0dc4fde4a0f1158939063c858cd1cfd3cbf1de2c6752
PMID	33691020
ParticipantIDs	pubmed_primary_33691020
PublicationCentury	2000
PublicationDate	2021-00-00
PublicationDateYYYYMMDD	2021-01-01
PublicationDate_xml	– year: 2021 text: 2021-00-00
PublicationDecade	2020
PublicationPlace	United States
PublicationPlace_xml	– name: United States
PublicationTitle	Pacific Symposium on Biocomputing
PublicationTitleAlternate	Pac Symp Biocomput
PublicationYear	2021
SSID	ssib015896917
Score	2.6327758
SourceID	pubmed
SourceType	Index Database
StartPage	232
SubjectTerms	Computational Biology; Humans; Machine Learning; Neural Networks, Computer; X-Rays
Title	CheXclusion: Fairness gaps in deep chest X-ray classifiers
URI	https://www.ncbi.nlm.nih.gov/pubmed/33691020
Volume	26
hasFullText
inHoldings	1
isFullTextHit
isPrint
linkProvider	National Library of Medicine