A large peptidome dataset improves HLA class I epitope prediction across most of the human population

Bibliographic Details
Published in	Nature Biotechnology, Vol. 38, no. 2, pp. 199-209
Main Authors	Sarkizova, Siranush; Klaeger, Susan; Le, Phuong M.; Li, Letitia W.; Oliveira, Giacomo; Keshishian, Hasmik; Hartigan, Christina R.; Zhang, Wandi; Braun, David A.; Ligon, Keith L.; Bachireddy, Pavan; Zervantonakis, Ioannis K.; Rosenbluth, Jennifer M.; Ouspenskaia, Tamara; Law, Travis; Justesen, Sune; Stevens, Jonathan; Lane, William J.; Eisenhaure, Thomas; Zhang, Guang Lan; Clauser, Karl R.; Hacohen, Nir; Carr, Steven A.; Wu, Catherine J.; Keskin, Derin B.
Format	Journal Article
Language	English
Published	New York: Nature Publishing Group US, 01.02.2020
Subjects	631/114/2397; 631/250/21; 631/250/580; 631/45/611; 692/308/575; Agriculture; Algorithms; Alleles; Amino Acid Motifs; Analysis; Antigenic determinants; Bioinformatics; Biomedical and Life Sciences; Biomedical Engineering/Biotechnology; Biomedicine; Biotechnology; Cancer immunotherapy; Cancer vaccines; Cell Line; Databases, Protein; Datasets; Epitopes; Epitopes - metabolism; Genetic Loci; Histocompatibility antigen HLA; Histocompatibility antigens; Histocompatibility Antigens Class I - metabolism; HLA histocompatibility antigens; Human populations; Humans; Immunotherapy; Life Sciences; Ligands; Mass spectrometry; Mass spectroscopy; Peptide Hydrolases - metabolism; Peptides; Peptides - chemistry; Peptides - metabolism; Prediction models; Predictions; Proteasome Endopeptidase Complex - metabolism; Proteome - metabolism; Transcription; Tumor cell lines; Vaccines; Massachusetts

More Information
Summary:	Prediction of HLA epitopes is important for the development of cancer immunotherapies and vaccines. However, current prediction algorithms have limited predictive power, in part because they were not trained on high-quality epitope datasets covering a broad range of HLA alleles. To enable prediction of endogenous HLA class I-associated peptides across a large fraction of the human population, we used mass spectrometry to profile >185,000 peptides eluted from 95 HLA-A, -B, -C and -G mono-allelic cell lines. We identified canonical peptide motifs per HLA allele, unique and shared binding submotifs across alleles and distinct motifs associated with different peptide lengths. By integrating these data with transcript abundance and peptide processing, we developed HLAthena, providing allele-and-length-specific and pan-allele-pan-length prediction models for endogenous peptide presentation. These models predicted endogenous HLA class I-associated ligands with 1.5-fold improvement in positive predictive value compared with existing tools and correctly identified >75% of HLA-bound peptides that were observed experimentally in 11 patient-derived tumor cell lines. Prediction of HLA class I epitopes is improved in accuracy and breadth with peptidomes from 95 mono-allelic cell lines.
Bibliography:	Author Contributions: D.B.K., C.J.W., N.H. and S.C. directed the overall study design. S.S. performed computational analyses and developed predictive models. S.K., C.R.H., H.K. and K.R.C. generated the MS data and performed data analysis. D.B.K. and G.L.Z. selected the HLA alleles for analysis; D.B.K., P.M.L. and L.W.L. generated the single HLA-allele cell lines and performed data generation. D.B.K., G.O., K.L., D.B., P.M.L. and L.W.L. developed the patient-derived tumor cell lines; I.K.Z. and J.M.R. generated and provided cells from an ovarian cancer PDX model; P.B. provided CLL samples for analysis. W.Z. provided expert technical assistance. T.E. generated RNA-seq data for mono-allelic cell lines; T.O. and T.L. generated and quantified Ribo-seq data. J.S. and W.L. performed HLA typing and validation of all cell lines. S.J. performed HLA-binding validation assays. S.S., S.K., N.H., C.J.W. and D.B.K. wrote the manuscript, with contributions from all co-authors. Lead Contact: cwu@partners.org
ISSN:	1087-0156 (print); 1546-1696 (online)
DOI:	10.1038/s41587-019-0322-9