A critical review of five machine learning-based algorithms for predicting protein stability changes upon mutation

Bibliographic Details
Published in Briefings in bioinformatics Vol. 21; no. 4; pp. 1285 - 1292
Main Author Fang, Jianwen
Format Journal Article
Language English
Published England Oxford University Press 15.07.2020
Oxford Publishing Limited (England)
Abstract A number of machine learning (ML)-based algorithms have been proposed for predicting mutation-induced stability changes in proteins. In this critical review, we used hypothetical reverse mutations to evaluate the performance of five representative algorithms and found all of them suffer from the problem of overfitting. This approach is based on the fact that if a wild-type protein is more stable than a mutant protein, then the same mutant is less stable than the wild-type protein. We analyzed the underlying issues and suggest that the main causes of the overfitting problem include that the numbers of training cases were too small, and the features used in the models were not sufficiently informative for the task. We make recommendations on how to avoid overfitting in this important research area and improve the reliability and robustness of ML-based algorithms in general.
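The reverse-mutation check described in the abstract can be sketched as follows. This is a minimal illustration, not the author's evaluation code: it assumes that a self-consistent predictor should return a stability change for the hypothetical reverse mutation (mutant to wild-type) that is equal in size and opposite in sign to the forward prediction. The function name `antisymmetry_report`, the two summary measures (Pearson correlation and mean bias), the sign convention, and all numbers are assumptions made for illustration and are not taken from the article.

```python
from math import sqrt


def antisymmetry_report(forward, reverse):
    """Compare predicted ddG values for forward and hypothetical reverse mutations.

    forward[i] is the predicted ddG for wild-type -> mutant, and reverse[i] is
    the predicted ddG for the same pair taken as mutant -> wild-type.
    Returns (pearson_r, bias). A self-consistent predictor gives a pearson_r
    close to -1 and a bias close to 0.
    """
    if len(forward) != len(reverse) or len(forward) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(forward)
    mean_f = sum(forward) / n
    mean_r = sum(reverse) / n
    cov = sum((f - mean_f) * (r - mean_r) for f, r in zip(forward, reverse))
    var_f = sum((f - mean_f) ** 2 for f in forward)
    var_r = sum((r - mean_r) ** 2 for r in reverse)
    if var_f == 0 or var_r == 0:
        raise ValueError("predictions must not be constant")
    pearson_r = cov / sqrt(var_f * var_r)
    # Half the mean of (forward + reverse); zero when ddG_rev == -ddG_fwd.
    bias = sum(f + r for f, r in zip(forward, reverse)) / (2 * n)
    return pearson_r, bias


# Made-up predictions for illustration only (assumed convention: kcal/mol,
# negative values mean destabilizing).
consistent = antisymmetry_report([-1.0, 0.5, -2.0], [1.0, -0.5, 2.0])
skewed = antisymmetry_report([-1.0, -2.0, -0.5], [-0.8, -1.9, -0.7])
```

In this sketch, `consistent` describes a predictor whose reverse predictions mirror the forward ones, while `skewed` describes one that labels both directions as destabilizing. The second pattern is the kind of inconsistency a reverse-mutation test is meant to expose.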
Author Fang, Jianwen
AuthorAffiliation Computational & Systems Biology Branch, Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20850, USA
AuthorEmail jianwen.fang@nih.gov
BackLink https://www.ncbi.nlm.nih.gov/pubmed/31273374$$D View this record in MEDLINE/PubMed
ContentType Journal Article
Copyright The Author(s) 2019. Published by Oxford University Press.
DOI 10.1093/bib/bbz071
Discipline Biology
EISSN 1477-4054
EndPage 1292
Genre Journal Article
ISSN 1477-4054
1467-5463
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Keywords robustness
mutation
protein stability
reverse mutation
computational prediction
reliability
Language English
License This work is written by US Government employees and is in the public domain in the US.
OpenAccessLink https://academic.oup.com/bib/article-pdf/21/4/1285/33584078/bbz071.pdf
PMID 31273374
PageCount 8
PublicationDate 2020-07-15
PublicationPlace Oxford, England
PublicationTitle Briefings in bioinformatics
PublicationTitleAlternate Brief Bioinform
PublicationYear 2020
Publisher Oxford University Press
Oxford Publishing Limited (England)
StartPage 1285
SubjectTerms Algorithms
Learning algorithms
Machine learning
Mutants
Mutation
Proteins
Review
Stability
Title A critical review of five machine learning-based algorithms for predicting protein stability changes upon mutation
URI https://www.ncbi.nlm.nih.gov/pubmed/31273374
https://www.proquest.com/docview/2456895256
https://search.proquest.com/docview/2253270511
https://pubmed.ncbi.nlm.nih.gov/PMC7373184
Volume 21