VotePLMs-AFP: Identification of antifreeze proteins using transformer-embedding features and ensemble learning
Published in | Biochimica et biophysica acta. General subjects Vol. 1868; no. 12; p. 130721 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | Netherlands: Elsevier B.V, 01.12.2024 |
Subjects | |
Summary: | Antifreeze proteins (AFPs) are a unique class of biomolecules capable of protecting other proteins, cell membranes, and cellular structures within organisms from damage caused by freezing conditions. Given the significance of AFPs in various domains such as biotechnology, agriculture, and medicine, several machine learning methods have been developed to identify AFPs. However, due to the complexity and diversity of AFPs, the predictive performance of existing methods is limited. Therefore, there is an urgent need to develop an efficient and rapid computational method for accurately predicting AFPs. In this study, we propose a novel predictor based on transformer-embedding features and ensemble learning for the identification of AFPs, termed VotePLMs-AFP. Firstly, three types of feature descriptors were extracted from pre-trained protein language models (PLMs) during the feature extraction process. Subsequently, we analyzed six combinations generated by these three embeddings to explore the optimal feature set, which was input into the soft voting-based ensemble learning classifier for the identification of AFPs. Finally, we evaluated the model on two benchmark datasets. The experimental results show that our model achieves high prediction accuracy in 10-fold cross-validation (CV) and independent set testing, outperforming existing state-of-the-art methods. Therefore, our model could serve as an effective tool for predicting AFPs.
• Integrate pre-trained PLMs into the AFP identification task. • The ensemble classifier improves the stability and robustness of the model. • Achieved new state-of-the-art performance in the identification of AFPs. |
---|---|
ISSN: | 0304-4165 1872-8006 |
DOI: | 10.1016/j.bbagen.2024.130721 |
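
The summary mentions a soft voting-based ensemble classifier. As a rough illustration of soft voting in general, and not the authors' implementation, the sketch below averages the class probabilities reported by several base classifiers and picks the class with the highest mean. The function names `soft_vote` and `predict_label`, the example probabilities, and the label convention (0 = non-AFP, 1 = AFP) are all assumptions made for this example.

```python
from typing import List, Optional, Sequence


def soft_vote(
    probas: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted mean of per-class probabilities for a single sample.

    probas[i][c] is the probability that (hypothetical) base classifier i
    assigns to class c.
    """
    if not probas:
        raise ValueError("need at least one classifier output")
    n_classes = len(probas[0])
    if any(len(p) != n_classes for p in probas):
        raise ValueError("all classifiers must report the same classes")
    if weights is None:
        weights = [1.0] * len(probas)  # equal weights by default
    if len(weights) != len(probas):
        raise ValueError("one weight per classifier is required")
    total = sum(weights)
    return [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]


def predict_label(
    probas: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> int:
    """Index of the class with the highest averaged probability."""
    avg = soft_vote(probas, weights)
    return max(range(len(avg)), key=avg.__getitem__)


# Made-up outputs of three base classifiers for one protein sequence,
# as [P(non-AFP), P(AFP)]. Assumed label convention: 0 = non-AFP, 1 = AFP.
example = [
    [0.90, 0.10],
    [0.40, 0.60],
    [0.45, 0.55],
]
averaged = soft_vote(example)
label = predict_label(example)
```

In this made-up case two of the three classifiers lean toward class 1, so a majority (hard) vote would pick class 1, while the averaged probabilities favor class 0 because the first classifier is far more confident. That sensitivity to confidence is the usual reason to prefer soft voting when the base classifiers produce reasonably calibrated probabilities.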