Data Science for QSAR for Protease activity

Bibliographic Details
Published in Journal of Computer Aided Chemistry Vol. 23; pp. 43-49
Main Authors Ueda, Hideki; Fukumori, Akio; Koge, Daiki; Ono, Naoaki; Altaf-Ul-Amin, Md; Kanaya, Shigehiko
Format Journal Article
Language English
Published Tokyo: Division of Chemical Information and Computer Sciences, The Chemical Society of Japan, 2023
Japan Science and Technology Agency

Summary:Proteolytic cleavage is influenced by the physicochemical properties of amino acids surrounding the cleavage site. Among these properties are 553 amino acid indices, and we considered that combining these indices with machine learning could create QSAR models for protease activity. In this study, we focused on γ-secretase, an enzyme known to be involved in the pathogenesis of Alzheimer’s disease. We created 10,680 regression models for the protease activity of γ-secretase by using 10 amino acid indices compressed from the 553 amino acid indices through principal component analysis, 12 pocket models of protease binding sites, and 89 machine learning models. We used these regression models to predict cleavage sites for 23 substrates where the cleavage sites were known and examined the amino acid property information used in the model with the highest prediction accuracy (87.0%). We found that the amino acid property information used in this model was related to the secondary structure of proteins, which may imply that it contains important information on the transmembrane cleavage of γ-secretase.
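The counts in the summary can be checked with a little arithmetic. The sketch below is not the authors' code. It assumes that each regression model pairs one compressed amino acid index with one pocket model and one machine learning model, which is the reading under which 10 x 12 x 89 gives the reported 10,680. It also assumes that accuracy is counted per substrate, in which case 87.0% matches 20 correct predictions out of 23. The function names are hypothetical and the integer labels are placeholders.

```python
from itertools import product

# Counts taken from the summary. The integer labels used below are
# placeholders, not the authors' identifiers.
N_INDICES = 10    # amino acid indices compressed from 553 by PCA
N_POCKETS = 12    # pocket models of the protease binding site
N_LEARNERS = 89   # machine learning models


def model_grid(n_indices=N_INDICES, n_pockets=N_POCKETS, n_learners=N_LEARNERS):
    """List every (index, pocket, learner) combination as integer triples.

    Assumption: one regression model per combination.
    """
    return list(product(range(n_indices), range(n_pockets), range(n_learners)))


def accuracy_percent(n_correct, n_total):
    """Share of correct predictions, in percent, rounded to one decimal."""
    return round(100.0 * n_correct / n_total, 1)


grid = model_grid()
print(len(grid))                 # 10680
print(accuracy_percent(20, 23))  # 87.0
```

Under the per-substrate assumption, 20 is the only whole number of correct predictions out of 23 that rounds to 87.0%.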
ISSN:1345-8647
DOI:10.2751/jcac.23.43