A critical review of five machine learning-based algorithms for predicting protein stability changes upon mutation
Abstract A number of machine learning (ML)-based algorithms have been proposed for predicting mutation-induced stability changes in proteins. In this critical review, we used hypothetical reverse mutations to evaluate the performance of five representative algorithms and found all of them suffer fro...
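The reverse-mutation check summarized in the abstract rests on a simple antisymmetry: if a predictor assigns a stability change to a mutation, it should assign the opposite change to the hypothetical reverse mutation, i.e. ΔΔG(reverse) = −ΔΔG(forward). The Python sketch below only illustrates that consistency test and is not the procedure used in the review. The function name `antisymmetry_report`, the returned keys, and the example numbers are all hypothetical, and the check does not depend on which sign convention a predictor uses for stabilizing mutations.

```python
from statistics import mean


def antisymmetry_report(forward_ddg, reverse_ddg):
    """Summarize how well paired predictions satisfy ddG(reverse) = -ddG(forward).

    forward_ddg[i] is a predicted stability change for mutation i;
    reverse_ddg[i] is the prediction for the hypothetical reverse mutation.
    Both names and the returned keys are illustrative assumptions.
    """
    if len(forward_ddg) != len(reverse_ddg) or not forward_ddg:
        raise ValueError("need two non-empty sequences of equal length")

    pairs = list(zip(forward_ddg, reverse_ddg))

    # For a perfectly antisymmetric predictor every forward + reverse sum is zero.
    sums = [f + r for f, r in pairs]

    # Half of the mean sum: a systematic shift applied to both directions.
    bias = mean(sums) / 2

    # Average size of the violation, regardless of direction.
    mean_abs_deviation = mean(abs(s) for s in sums)

    # A pair is sign-consistent if the two predictions have opposite signs
    # (or are both exactly zero).
    consistent = sum(1 for f, r in pairs if f * r < 0 or (f == 0 and r == 0))

    return {
        "bias": bias,
        "mean_abs_deviation": mean_abs_deviation,
        "sign_consistent_fraction": consistent / len(pairs),
    }


if __name__ == "__main__":
    # Made-up placeholder values, purely to show the call; not data from the review.
    forward = [-1.2, -0.4, 0.8]
    reverse = [-0.9, 0.3, -0.7]
    print(antisymmetry_report(forward, reverse))
```

A predictor that labels both a mutation and its reverse as destabilizing would show a low sign-consistent fraction and a nonzero bias here, which is the kind of inconsistency the abstract associates with overfitting.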
Published in | Briefings in bioinformatics Vol. 21; no. 4; pp. 1285 - 1292 |
---|---|
Main Author | Fang, Jianwen |
Format | Journal Article |
Language | English |
Published | England; Oxford University Press; 15.07.2020; Oxford Publishing Limited (England) |
Subjects | |
Abstract | A number of machine learning (ML)-based algorithms have been proposed for predicting mutation-induced stability changes in proteins. In this critical review, we used hypothetical reverse mutations to evaluate the performance of five representative algorithms and found all of them suffer from the problem of overfitting. This approach is based on the fact that if a wild-type protein is more stable than a mutant protein, then the same mutant is less stable than the wild-type protein. We analyzed the underlying issues and suggest that the main causes of the overfitting problem include that the numbers of training cases were too small, and the features used in the models were not sufficiently informative for the task. We make recommendations on how to avoid overfitting in this important research area and improve the reliability and robustness of ML-based algorithms in general. |
---|---|
Author | Fang, Jianwen |
AuthorAffiliation | Computational & Systems Biology Branch, Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20850, USA |
Author_xml | – sequence: 1 givenname: Jianwen surname: Fang fullname: Fang, Jianwen email: jianwen.fang@nih.gov organization: Computational & Systems Biology Branch, Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20850, USA |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/31273374$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | The Author(s) 2019. Published by Oxford University Press. |
DOI | 10.1093/bib/bbz071 |
Discipline | Biology |
EISSN | 1477-4054 |
EndPage | 1292 |
ExternalDocumentID | 10_1093_bib_bbz071 31273374 10.1093/bib/bbz071 |
Genre | Journal Article |
ISSN | 1477-4054 1467-5463 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
Keywords | robustness; mutation; protein stability; reverse mutation; computational prediction; reliability |
Language | English |
License | This work is written by US Government employees and is in the public domain in the US. The Author(s) 2019. Published by Oxford University Press. |
OpenAccessLink | https://academic.oup.com/bib/article-pdf/21/4/1285/33584078/bbz071.pdf |
PMID | 31273374 |
PageCount | 8 |
PublicationCentury | 2000 |
PublicationDate | 20200715 |
PublicationDateYYYYMMDD | 2020-07-15 |
PublicationDate_xml | – month: 07 year: 2020 text: 20200715 day: 15 |
PublicationDecade | 2020 |
PublicationPlace | England |
PublicationPlace_xml | – name: England – name: Oxford |
PublicationTitle | Briefings in bioinformatics |
PublicationTitleAlternate | Brief Bioinform |
PublicationYear | 2020 |
Publisher | Oxford University Press Oxford Publishing Limited (England) |
Publisher_xml | – name: Oxford University Press – name: Oxford Publishing Limited (England) |
SecondaryResourceType | review_article |
StartPage | 1285 |
SubjectTerms | Algorithms; Learning algorithms; Machine learning; Mutants; Mutation; Proteins; Review; Stability |
URI | https://www.ncbi.nlm.nih.gov/pubmed/31273374 https://www.proquest.com/docview/2456895256 https://search.proquest.com/docview/2253270511 https://pubmed.ncbi.nlm.nih.gov/PMC7373184 |
Volume | 21 |