VotePLMs-AFP: Identification of antifreeze proteins using transformer-embedding features and ensemble learning
Antifreeze proteins (AFPs) are a unique class of biomolecules capable of protecting other proteins, cell membranes, and cellular structures within organisms from damage caused by freezing conditions. Given the significance of AFPs in various domains such as biotechnology, agriculture, and medicine,...
Published in | Biochimica et biophysica acta. General subjects Vol. 1868; no. 12; p. 130721 |
---|---|
Main Authors | Qi, Dawei; Liu, Taigang |
Format | Journal Article |
Language | English |
Published | Netherlands, Elsevier B.V, 01.12.2024 |
Abstract | Antifreeze proteins (AFPs) are a unique class of biomolecules capable of protecting other proteins, cell membranes, and cellular structures within organisms from damage caused by freezing conditions. Given the significance of AFPs in various domains such as biotechnology, agriculture, and medicine, several machine learning methods have been developed to identify AFPs. However, due to the complexity and diversity of AFPs, the predictive performance of existing methods is limited. Therefore, there is an urgent need to develop an efficient and rapid computational method for accurately predicting AFPs. In this study, we proposed a novel predictor based on transformer-embedding features and ensemble learning for the identification of AFPs, termed VotePLMs-AFP. Firstly, three types of feature descriptors were extracted from pre-trained protein language models (PLMs) during the feature extraction process. Subsequently, we analyzed six combinations generated by these three embeddings to explore the optimal feature set, which was input into the soft voting-based ensemble learning classifier for the identification of AFPs. Finally, we evaluated the model on the two benchmark datasets. The experimental results show that our model achieves high prediction accuracy in 10-fold cross-validation (CV) and independent set testing, outperforming existing state-of-the-art methods. Therefore, our model could serve as an effective tool for predicting AFPs.
• Integrate pre-trained PLMs into AFPs identification task.
• The ensemble classifier improves the stability and robustness of the model.
• Achieved new state-of-the-art performance in the identification of AFPs. |
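The abstract names soft voting as the ensemble strategy but does not list the base classifiers or their weights. As a minimal sketch of the general technique only, assuming three hypothetical classifiers that each output probabilities for the classes [non-AFP, AFP], the following Python shows how soft voting averages probabilities and how it can differ from a majority (hard) vote. The function names, the weights parameter, and the example probabilities are illustrative assumptions, not the authors' implementation.

```python
def soft_vote(prob_rows, weights=None):
    """Weighted average of per-classifier class probabilities.

    prob_rows: one row per classifier, e.g. [P(non-AFP), P(AFP)].
    Returns (predicted class index, averaged probabilities).
    Illustrative helper; not taken from the VotePLMs-AFP code.
    """
    if not prob_rows:
        raise ValueError("need at least one classifier output")
    n_classes = len(prob_rows[0])
    if any(len(row) != n_classes for row in prob_rows):
        raise ValueError("all classifiers must report the same classes")
    if weights is None:
        weights = [1.0] * len(prob_rows)  # equal weights by default
    if len(weights) != len(prob_rows):
        raise ValueError("one weight per classifier is required")
    total = float(sum(weights))
    averaged = [
        sum(w * row[c] for w, row in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged


def hard_vote(prob_rows):
    """Majority vote over each classifier's most probable class."""
    votes = [max(range(len(row)), key=lambda c: row[c]) for row in prob_rows]
    return max(set(votes), key=votes.count)


if __name__ == "__main__":
    # Hypothetical outputs: two weakly negative classifiers, one confident positive.
    outputs = [[0.55, 0.45], [0.55, 0.45], [0.10, 0.90]]
    print("soft:", soft_vote(outputs))
    print("hard:", hard_vote(outputs))
```

With these made-up inputs the averaged AFP probability is about 0.6, so soft voting predicts AFP even though two of the three classifiers lean the other way. That sensitivity to confidence is the usual reason to prefer soft over hard voting when the base models produce calibrated probabilities.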
---|---|
ArticleNumber | 130721 |
Author | Qi, Dawei; Liu, Taigang |
Author_xml | – sequence: 1 givenname: Dawei surname: Qi fullname: Qi, Dawei – sequence: 2 givenname: Taigang surname: Liu fullname: Liu, Taigang email: tgliu@shou.edu.cn |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/39426757$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | Copyright © 2024. Published by Elsevier B.V. |
DOI | 10.1016/j.bbagen.2024.130721 |
DatabaseName | CrossRef Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed MEDLINE - Academic |
DatabaseTitle | CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) MEDLINE - Academic |
DatabaseTitleList | MEDLINE - Academic MEDLINE |
Database_xml | – sequence: 1 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: EIF name: MEDLINE url: https://proxy.k.utb.cz/login?url=https://www.webofscience.com/wos/medline/basic-search sourceTypes: Index Database |
Discipline | Chemistry Biology |
EISSN | 1872-8006 |
ExternalDocumentID | 39426757 10_1016_j_bbagen_2024_130721 S0304416524001648 |
Genre | Research Support, Non-U.S. Gov't Journal Article |
ISSN | 0304-4165 1872-8006 |
IngestDate | Fri Jul 11 15:34:26 EDT 2025 Fri May 30 10:59:47 EDT 2025 Tue Jul 01 00:22:20 EDT 2025 Sat Nov 09 16:00:11 EST 2024 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 12 |
Keywords | Ensemble learning Protein language models Soft voting Antifreeze proteins Machine learning |
Language | English |
License | Copyright © 2024. Published by Elsevier B.V. |
PMID | 39426757 |
PublicationDate | December 2024 2024-12-00 20241201 |
PublicationDateYYYYMMDD | 2024-12-01 |
PublicationDate_xml | – month: 12 year: 2024 text: December 2024 |
PublicationDecade | 2020 |
PublicationPlace | Netherlands |
PublicationPlace_xml | – name: Netherlands |
PublicationTitle | Biochimica et biophysica acta. General subjects |
PublicationTitleAlternate | Biochim Biophys Acta Gen Subj |
PublicationYear | 2024 |
Publisher | Elsevier B.V |
Publisher_xml | – name: Elsevier B.V |
– volume: 16 start-page: 603 issue: 7 year: 2019 ident: 10.1016/j.bbagen.2024.130721_bb0190 article-title: Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold publication-title: Nat. Methods doi: 10.1038/s41592-019-0437-4 – volume: 245 start-page: 67 issue: 1 year: 2005 ident: 10.1016/j.bbagen.2024.130721_bb0005 article-title: A hyperactive, Ca2+−dependent antifreeze protein in an Antarctic bacterium publication-title: FEMS Microbiol. Lett. doi: 10.1016/j.femsle.2005.02.022 – volume: 351 start-page: 14 issue: 1 year: 2014 ident: 10.1016/j.bbagen.2024.130721_bb0025 article-title: Antifreeze protein activity in Arctic cryoconite bacteria publication-title: FEMS Microbiol. Lett. doi: 10.1111/1574-6968.12345 – volume: 25 start-page: 256 issue: 1 year: 2024 ident: 10.1016/j.bbagen.2024.130721_bb0210 article-title: StackedEnC-AOP: prediction of antioxidant proteins using transform evolutionary and sequential features based multi-scale vector with stacked ensemble learning publication-title: BMC Bioinform. doi: 10.1186/s12859-024-05884-6 – volume: 13 start-page: 2196 issue: 2 year: 2012 ident: 10.1016/j.bbagen.2024.130721_bb0085 article-title: Using support vector machine and evolutionary profiles to predict antifreeze protein sequences publication-title: Int. J. Mol. Sci. doi: 10.3390/ijms13022196 – volume: 151 year: 2024 ident: 10.1016/j.bbagen.2024.130721_bb0060 article-title: iAFPs-mv-BiTCN: predicting antifungal peptides using self-attention transformer embedding and transform evolutionary based multi-view features with bidirectional temporal convolutional networks publication-title: Artif. Intell. Med. doi: 10.1016/j.artmed.2024.102860 – volume: 118 issue: 15 year: 2021 ident: 10.1016/j.bbagen.2024.130721_bb0170 article-title: Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences publication-title: Proc. Natl. Acad. Sci. 
USA doi: 10.1073/pnas.2016239118 – volume: 6 issue: 5 year: 2011 ident: 10.1016/j.bbagen.2024.130721_bb0070 article-title: Identification of antifreeze proteins and their functional residues by support vector machine and genetic algorithms based on n-peptide compositions publication-title: PLoS One doi: 10.1371/journal.pone.0020445 |
SourceID | proquest; pubmed; crossref; elsevier |
SourceType | Aggregation Database; Index Database; Publisher |
StartPage | 130721 |
SubjectTerms | Algorithms; Antifreeze proteins; Antifreeze Proteins - chemistry; Computational Biology - methods; Databases, Protein; Ensemble learning; Machine Learning; Protein language models; Soft voting |
Title | VotePLMs-AFP: Identification of antifreeze proteins using transformer-embedding features and ensemble learning |
URI | https://dx.doi.org/10.1016/j.bbagen.2024.130721 https://www.ncbi.nlm.nih.gov/pubmed/39426757 https://www.proquest.com/docview/3118471188 |
Volume | 1868 |
linkProvider | Elsevier |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=VotePLMs-AFP%3A+Identification+of+antifreeze+proteins+using+transformer-embedding+features+and+ensemble+learning&rft.jtitle=Biochimica+et+biophysica+acta.+General+subjects&rft.au=Qi%2C+Dawei&rft.au=Liu%2C+Taigang&rft.date=2024-12-01&rft.eissn=1872-8006&rft.volume=1868&rft.issue=12&rft.spage=130721&rft_id=info:doi/10.1016%2Fj.bbagen.2024.130721&rft_id=info%3Apmid%2F39426757&rft.externalDocID=39426757 |