VotePLMs-AFP: Identification of antifreeze proteins using transformer-embedding features and ensemble learning

Bibliographic Details
Published in Biochimica et biophysica acta. General subjects Vol. 1868; no. 12; p. 130721
Main Authors Qi, Dawei; Liu, Taigang
Format Journal Article
Language English
Published Netherlands Elsevier B.V. 01.12.2024
Abstract Antifreeze proteins (AFPs) are a unique class of biomolecules capable of protecting other proteins, cell membranes, and cellular structures within organisms from damage caused by freezing conditions. Given the significance of AFPs in various domains such as biotechnology, agriculture, and medicine, several machine learning methods have been developed to identify AFPs. However, due to the complexity and diversity of AFPs, the predictive performance of existing methods is limited. Therefore, there is an urgent need to develop an efficient and rapid computational method for accurately predicting AFPs. In this study, we proposed a novel predictor based on transformer-embedding features and ensemble learning for the identification of AFPs, termed VotePLMs-AFP. Firstly, three types of feature descriptors were extracted from pre-trained protein language models (PLMs) during the feature extraction process. Subsequently, we analyzed six combinations generated by these three embeddings to explore the optimal feature set, which was input into the soft voting-based ensemble learning classifier for the identification of AFPs. Finally, we evaluated the model on the two benchmark datasets. The experimental results show that our model achieves high prediction accuracy in 10-fold cross-validation (CV) and independent set testing, outperforming existing state-of-the-art methods. Therefore, our model could serve as an effective tool for predicting AFPs.
Highlights
• Integrate pre-trained PLMs into AFPs identification task.
• The ensemble classifier improves the stability and robustness of the model.
• Achieved new state-of-the-art performance in the identification of AFPs.
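The soft voting step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which base classifiers, weights, or decision threshold VotePLMs-AFP uses, so the function names `soft_vote` and `hard_vote`, the 0.5 threshold, and the probability values below are all hypothetical. The sketch only shows the general idea that soft voting averages each base model's predicted probability, whereas hard voting counts thresholded decisions.

```python
def soft_vote(probabilities, threshold=0.5):
    """Average per-model P(positive) for each sample, then threshold the mean.

    probabilities: list of per-model lists, one probability per sample.
    Returns (mean_probabilities, predicted_labels).
    """
    n_models = len(probabilities)
    n_samples = len(probabilities[0])
    if any(len(p) != n_samples for p in probabilities):
        raise ValueError("every model must score the same samples")
    means = [
        sum(model[i] for model in probabilities) / n_models
        for i in range(n_samples)
    ]
    labels = [int(m >= threshold) for m in means]
    return means, labels


def hard_vote(probabilities, threshold=0.5):
    """Majority vote over each model's thresholded decision, for comparison."""
    n_models = len(probabilities)
    n_samples = len(probabilities[0])
    votes = [
        sum(model[i] >= threshold for model in probabilities)
        for i in range(n_samples)
    ]
    return [int(2 * v > n_models) for v in votes]


# Hypothetical P(AFP) outputs of three base classifiers for three sequences.
model_probs = [
    [0.90, 0.20, 0.55],
    [0.40, 0.10, 0.60],
    [0.40, 0.30, 0.45],
]
soft_means, soft_labels = soft_vote(model_probs)
hard_labels = hard_vote(model_probs)
print("soft:", soft_labels, "hard:", hard_labels)
```

In this made-up example the first sequence is scored confidently by one model and weakly by the other two, so the averaged probability crosses the threshold while the majority vote does not. That is the usual motivation for soft voting: it lets a confident model outweigh lukewarm ones.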
ArticleNumber 130721
Author Qi, Dawei
Liu, Taigang (email: tgliu@shou.edu.cn)
BackLink https://www.ncbi.nlm.nih.gov/pubmed/39426757 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright 2024
Copyright © 2024. Published by Elsevier B.V.
DOI 10.1016/j.bbagen.2024.130721
Discipline Chemistry
Biology
EISSN 1872-8006
ExternalDocumentID 39426757
10_1016_j_bbagen_2024_130721
S0304416524001648
Genre Research Support, Non-U.S. Gov't
Journal Article
ISSN 0304-4165
1872-8006
IsPeerReviewed true
IsScholarly true
Issue 12
Keywords Ensemble learning
Protein language models
Soft voting
Antifreeze proteins
Machine learning
Language English
License Copyright © 2024. Published by Elsevier B.V.
PMID 39426757
PublicationDate December 2024 (2024-12-01)
PublicationPlace Netherlands
PublicationTitle Biochimica et biophysica acta. General subjects
PublicationTitleAlternate Biochim Biophys Acta Gen Subj
PublicationYear 2024
Publisher Elsevier B.V
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2023.3321100
– volume: 357
  start-page: 927
  issue: 1423
  year: 2002
  ident: 10.1016/j.bbagen.2024.130721_bb0020
  article-title: Structure and function of antifreeze proteins
  publication-title: Philos. Trans. R. Soc. Lond. Ser. B Biol. Sci.
  doi: 10.1098/rstb.2002.1081
– volume: 28
  start-page: 405
  issue: 3
  year: 1997
  ident: 10.1016/j.bbagen.2024.130721_bb0150
  article-title: Pfam: a comprehensive database of protein domain families based on seed alignments
  publication-title: Proteins
  doi: 10.1002/(SICI)1097-0134(199707)28:3<405::AID-PROT10>3.0.CO;2-L
– volume: 42
  issue: Database issue
  year: 2014
  ident: 10.1016/j.bbagen.2024.130721_bb0180
  article-title: Pfam: the protein families database
  publication-title: Nucleic Acids Res.
– volume: 23
  issue: 1
  year: 2022
  ident: 10.1016/j.bbagen.2024.130721_bb0205
  article-title: T4SEfinder: a bioinformatics tool for genome-scale prediction of bacterial type IV secreted effectors using pre-trained protein language model
  publication-title: Brief. Bioinform.
  doi: 10.1093/bib/bbab420
– volume: 139
  year: 2021
  ident: 10.1016/j.bbagen.2024.130721_bb0050
  article-title: AFP-CMBPred: computational identification of antifreeze proteins by extending consensus sequences into multi-blocks evolutionary information
  publication-title: Comput. Biol. Med.
  doi: 10.1016/j.compbiomed.2021.105006
– volume: 694
  year: 2024
  ident: 10.1016/j.bbagen.2024.130721_bb0125
  article-title: PreDBP-PLMs: prediction of DNA-binding proteins based on pre-trained protein language models and convolutional neural networks
  publication-title: Anal. Biochem.
  doi: 10.1016/j.ab.2024.115603
– volume: 25
  start-page: 3389
  issue: 17
  year: 1997
  ident: 10.1016/j.bbagen.2024.130721_bb0155
  article-title: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/25.17.3389
– year: 2019
  ident: 10.1016/j.bbagen.2024.130721_bb0175
  article-title: Evaluating Protein Transfer Learning with TAPE
– volume: 30
  year: 2017
  ident: 10.1016/j.bbagen.2024.130721_bb0115
  article-title: Attention is all you need
  publication-title: Adv. Neural Inf. Proces. Syst.
– volume: 16
  start-page: 603
  issue: 7
  year: 2019
  ident: 10.1016/j.bbagen.2024.130721_bb0190
  article-title: Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold
  publication-title: Nat. Methods
  doi: 10.1038/s41592-019-0437-4
– volume: 245
  start-page: 67
  issue: 1
  year: 2005
  ident: 10.1016/j.bbagen.2024.130721_bb0005
  article-title: A hyperactive, Ca2+−dependent antifreeze protein in an Antarctic bacterium
  publication-title: FEMS Microbiol. Lett.
  doi: 10.1016/j.femsle.2005.02.022
– volume: 351
  start-page: 14
  issue: 1
  year: 2014
  ident: 10.1016/j.bbagen.2024.130721_bb0025
  article-title: Antifreeze protein activity in Arctic cryoconite bacteria
  publication-title: FEMS Microbiol. Lett.
  doi: 10.1111/1574-6968.12345
– volume: 25
  start-page: 256
  issue: 1
  year: 2024
  ident: 10.1016/j.bbagen.2024.130721_bb0210
  article-title: StackedEnC-AOP: prediction of antioxidant proteins using transform evolutionary and sequential features based multi-scale vector with stacked ensemble learning
  publication-title: BMC Bioinform.
  doi: 10.1186/s12859-024-05884-6
– volume: 13
  start-page: 2196
  issue: 2
  year: 2012
  ident: 10.1016/j.bbagen.2024.130721_bb0085
  article-title: Using support vector machine and evolutionary profiles to predict antifreeze protein sequences
  publication-title: Int. J. Mol. Sci.
  doi: 10.3390/ijms13022196
– volume: 151
  year: 2024
  ident: 10.1016/j.bbagen.2024.130721_bb0060
  article-title: iAFPs-mv-BiTCN: predicting antifungal peptides using self-attention transformer embedding and transform evolutionary based multi-view features with bidirectional temporal convolutional networks
  publication-title: Artif. Intell. Med.
  doi: 10.1016/j.artmed.2024.102860
– volume: 118
  issue: 15
  year: 2021
  ident: 10.1016/j.bbagen.2024.130721_bb0170
  article-title: Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences
  publication-title: Proc. Natl. Acad. Sci. USA
  doi: 10.1073/pnas.2016239118
– volume: 6
  issue: 5
  year: 2011
  ident: 10.1016/j.bbagen.2024.130721_bb0070
  article-title: Identification of antifreeze proteins and their functional residues by support vector machine and genetic algorithms based on n-peptide compositions
  publication-title: PLoS One
  doi: 10.1371/journal.pone.0020445
StartPage 130721
SubjectTerms Algorithms
Antifreeze proteins
Antifreeze Proteins - chemistry
Computational Biology - methods
Databases, Protein
Ensemble learning
Machine Learning
Protein language models
Soft voting
Title VotePLMs-AFP: Identification of antifreeze proteins using transformer-embedding features and ensemble learning
URI https://dx.doi.org/10.1016/j.bbagen.2024.130721
https://www.ncbi.nlm.nih.gov/pubmed/39426757
https://www.proquest.com/docview/3118471188
Volume 1868