Advancing oral delivery of biologics: Machine learning predicts peptide stability in the gastrointestinal tract

Bibliographic Details
Published in International Journal of Pharmaceutics Vol. 634; p. 122643
Main Authors Wang, Fanjin, Sangfuang, Nannapat, McCoubrey, Laura E., Yadav, Vipul, Elbadawi, Moe, Orlu, Mine, Gaisford, Simon, Basit, Abdul W.
Format Journal Article
Language English
Published Netherlands Elsevier B.V. 05.03.2023
Summary:
• 109 therapeutic peptide stabilities in SGF/SIF were extracted from the literature.
• Various machine learning techniques were utilised to predict the stability of untested peptides.
• The best models were k-nearest neighbors (SGF, F1: 84.5%) and XGBoost (SIF, F1: 73.4%).
• Peptide lipophilicity, rigidity, and size were the most important determinants of stability.

The oral delivery of peptide therapeutics could facilitate precision treatment of numerous gastrointestinal (GI) and systemic diseases with simple administration for patients. However, the vast majority of licensed peptide drugs are currently administered parenterally due to prohibitive peptide instability in the GI tract. As such, the development of GI-stable peptides is receiving considerable investment. This study provides researchers with the first tool to predict the GI stability of peptide therapeutics based solely on the amino acid sequence. Both unsupervised and supervised machine learning techniques were trained on literature-extracted data describing peptide stability in simulated gastric and small intestinal fluid (SGF and SIF). Based on 109 peptide incubations, classification models for SGF and SIF were developed. The best models utilized k-nearest neighbors (for SGF) and XGBoost (for SIF) algorithms, with accuracies of 75.1% (SGF) and 69.3% (SIF), and F1 scores of 84.5% (SGF) and 73.4% (SIF) under 5-fold cross-validation. Feature importance analysis demonstrated that peptides’ lipophilicity, rigidity, and size were key determinants of stability. These models are now available to those working on the development of oral peptide therapeutics.
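
As a rough illustration of the workflow the summary describes (sequence-derived descriptors, a k-nearest-neighbour classifier, and F1 scoring under 5-fold cross-validation), the following Python sketch uses only the standard library. The three descriptors, the use of the Kyte-Doolittle hydropathy scale, the interleaved fold assignment, and the toy sequences with their labels are all assumptions made for illustration. None of them comes from the paper, whose actual features and data are not listed in this record.

```python
from collections import Counter
from math import dist

# Kyte-Doolittle hydropathy scale. Using it as a lipophilicity proxy is an
# assumption; the paper's real descriptors are not given in this record.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def featurize(seq):
    """Return (length, mean hydropathy, cysteine fraction) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    return (float(n), sum(KD[a] for a in seq) / n, seq.count("C") / n)


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cross_validate(X, y, n_folds=5, k=3):
    """Mean F1 over n_folds; sample i goes to test fold i % n_folds."""
    scores = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        if not test_idx or not train_idx:
            continue  # skip folds that cannot be evaluated
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        preds = [knn_predict(train_X, train_y, X[i], k) for i in test_idx]
        scores.append(f1_score([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy peptides; labels (1 = "stable") are invented, not measured.
TOY = [
    ("CGLPCWKC", 1), ("ACDEFGHIK", 0), ("CCPLWVCF", 1), ("KRDESTNQ", 0),
    ("LLPVCCIW", 1), ("GSGSKRKR", 0), ("CFPWCLVC", 1), ("DEKRHNQST", 0),
    ("PCLCVWFI", 1), ("NQSTGKRE", 0),
]

if __name__ == "__main__":
    X = [featurize(seq) for seq, _ in TOY]
    y = [label for _, label in TOY]
    print(f"mean F1 over 5 folds: {cross_validate(X, y):.3f}")
```

The descriptors are left unscaled here, so peptide length dominates the distance. A real pipeline would normally standardise features before applying k-nearest neighbors.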
ISSN:0378-5173, 1873-3476
DOI:10.1016/j.ijpharm.2023.122643