Towards Equitable MHC Binding Predictions: Computational Strategies to Assess and Reduce Data Bias

Deep learning tools that predict peptide binding by major histocompatibility complex (MHC) proteins play an essential role in developing personalized cancer immunotherapies and vaccines. In order to ensure equitable health outcomes from their application, MHC binding prediction methods must work wel...

Full description

Saved in:
Bibliographic Details
Published inbioRxiv
Main Authors Singh, Mona, Glynn, Eric, Ghersi, Dario
Format Paper
LanguageEnglish
Published Cold Spring Harbor Cold Spring Harbor Laboratory Press 02.02.2024
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Deep learning tools that predict peptide binding by major histocompatibility complex (MHC) proteins play an essential role in developing personalized cancer immunotherapies and vaccines. In order to ensure equitable health outcomes from their application, MHC binding prediction methods must work well across the vast landscape of MHC alleles. Here we show that there are alarming differences across individuals in different racial and ethnic groups in how much binding data are associated with their MHC alleles. We introduce a machine learning framework to assess the impact of this data disparity for predicting binding for any given MHC allele, and apply it to develop a state-of-the-art MHC binding prediction model that additionally provides per-allele performance estimates. We demonstrate that our MHC binding model successfully mitigates much of the data disparities observed across racial groups. To address remaining inequities, we devise an algorithmic strategy for targeted data collection. Our work lays the foundation for further development of equitable MHC binding models for use in personalized immunotherapies.Competing Interest StatementThe authors have declared no competing interest.
DOI:10.1101/2024.01.30.578103