CheXclusion: Fairness gaps in deep chest X-ray classifiers
Published in | Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Vol. 26; p. 232 |
---|---|
Main Authors | Seyyed-Kalantari, Laleh; Liu, Guanxiong; McDermott, Matthew; Chen, Irene Y; Ghassemi, Marzyeh |
Format | Journal Article |
Language | English |
Published | United States, 2021 |
Subjects | Computational Biology; Humans; Machine Learning; Neural Networks, Computer; X-Rays |
ISSN | 2335-6936 |
Abstract | Machine learning systems have received much attention recently for their ability to achieve expert-level performance on clinical tasks, particularly in medical imaging. Here, we examine the extent to which state-of-the-art deep learning classifiers trained to yield diagnostic labels from X-ray images are biased with respect to protected attributes. We train convolutional neural networks to predict 14 diagnostic labels in 3 prominent public chest X-ray datasets: MIMIC-CXR, Chest-Xray8, and CheXpert, as well as a multi-site aggregation of all those datasets. We evaluate the TPR disparity, that is, the difference in true positive rates (TPR), among different protected attributes such as patient sex, age, race, and insurance type as a proxy for socioeconomic status. We demonstrate that TPR disparities exist in the state-of-the-art classifiers in all datasets, for all clinical tasks, and all subgroups. A multi-source dataset corresponds to the smallest disparities, suggesting one way to reduce bias. We find that TPR disparities are not significantly correlated with a subgroup's proportional disease burden. As clinical models move from papers to products, we encourage clinical decision makers to carefully audit for algorithmic disparities prior to deployment. Our supplementary materials can be found at http://www.marzyehghassemi.com/chexclusion-supp-3/. |
---|---|
Author | McDermott, Matthew Ghassemi, Marzyeh Seyyed-Kalantari, Laleh Chen, Irene Y Liu, Guanxiong |
Author_xml | – sequence: 1 givenname: Laleh surname: Seyyed-Kalantari fullname: Seyyed-Kalantari, Laleh email: laleh@cs.toronto.edu organization: Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute, Toronto, Ontario, Canada. Corresponding author, laleh@cs.toronto.edu – sequence: 2 givenname: Guanxiong surname: Liu fullname: Liu, Guanxiong – sequence: 3 givenname: Matthew surname: McDermott fullname: McDermott, Matthew – sequence: 4 givenname: Irene Y surname: Chen fullname: Chen, Irene Y – sequence: 5 givenname: Marzyeh surname: Ghassemi fullname: Ghassemi, Marzyeh |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/33691020$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
EISSN | 2335-6936 |
ExternalDocumentID | 33691020 |
Genre | Research Support, Non-U.S. Gov't Journal Article |
IngestDate | Thu Nov 24 21:46:41 EST 2022 |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
LinkModel | OpenURL |
PMID | 33691020 |
ParticipantIDs | pubmed_primary_33691020 |
PublicationCentury | 2000 |
PublicationDate | 2021-00-00 |
PublicationDateYYYYMMDD | 2021-01-01 |
PublicationDate_xml | – year: 2021 text: 2021-00-00 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States |
PublicationTitle | Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing |
PublicationTitleAlternate | Pac Symp Biocomput |
PublicationYear | 2021 |
SSID | ssib015896917 |
Score | 2.6327758 |
SourceID | pubmed |
SourceType | Index Database |
StartPage | 232 |
SubjectTerms | Computational Biology Humans Machine Learning Neural Networks, Computer X-Rays |
Title | CheXclusion: Fairness gaps in deep chest X-ray classifiers |
URI | https://www.ncbi.nlm.nih.gov/pubmed/33691020 |
Volume | 26 |
linkProvider | National Library of Medicine |
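
The abstract describes the TPR disparity as the difference in true positive rates among subgroups of a protected attribute. A minimal Python sketch of that kind of audit follows. It is an illustration under stated assumptions, not the authors' code: the function names `true_positive_rate` and `tpr_disparities` are hypothetical, the labels and predictions are assumed to be binary for a single diagnostic label, and measuring each subgroup against the median TPR across subgroups is an assumed choice of reference that this record does not confirm.

```python
from statistics import median


def true_positive_rate(y_true, y_pred):
    """Return TP / (TP + FN), or None if there are no positive cases."""
    preds_on_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_on_positives:
        return None
    return sum(preds_on_positives) / len(preds_on_positives)


def tpr_disparities(y_true, y_pred, groups):
    """Per-subgroup TPR minus a reference TPR.

    Assumption: the reference is the median TPR across subgroups.
    Subgroups with no positive cases are skipped.
    """
    by_group = {}
    for t, p, g in zip(y_true, y_pred, groups):
        trues, preds = by_group.setdefault(g, ([], []))
        trues.append(t)
        preds.append(p)

    tprs = {}
    for g, (trues, preds) in by_group.items():
        rate = true_positive_rate(trues, preds)
        if rate is not None:
            tprs[g] = rate

    if not tprs:
        return {}
    reference = median(tprs.values())
    return {g: rate - reference for g, rate in tprs.items()}


# Toy data for one diagnostic label (made up for illustration only).
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0]
groups = ["F", "F", "F", "F", "F", "M", "M", "M"]
gaps = tpr_disparities(y_true, y_pred, groups)
```

On this toy data the subgroup TPRs are 0.75 for "F" and 0.5 for "M", so the gaps relative to the median are positive for "F" and negative for "M". In a real audit the same computation would be repeated per diagnostic label and per protected attribute (sex, age, race, insurance type).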