A critical review of five machine learning-based algorithms for predicting protein stability changes upon mutation

Bibliographic Details
Published in	Briefings in bioinformatics Vol. 21; no. 4; pp. 1285 - 1292
Main Author	Fang, Jianwen
Format	Journal Article
Language	English
Published	England Oxford University Press 15.07.2020 Oxford Publishing Limited (England)
Subjects	Algorithms; Learning algorithms; Machine learning; Mutants; Mutation; Proteins; Review; Stability; robustness; mutation; protein stability; reverse mutation; computational prediction; reliability
Summary:	A number of machine learning (ML)-based algorithms have been proposed for predicting mutation-induced stability changes in proteins. In this critical review, we used hypothetical reverse mutations to evaluate the performance of five representative algorithms and found all of them suffer from the problem of overfitting. This approach is based on the fact that if a wild-type protein is more stable than a mutant protein, then the same mutant is less stable than the wild-type protein. We analyzed the underlying issues and suggest that the main causes of the overfitting problem include that the numbers of training cases were too small, and the features used in the models were not sufficiently informative for the task. We make recommendations on how to avoid overfitting in this important research area and improve the reliability and robustness of ML-based algorithms in general.
ISSN:	1477-4054 1467-5463
DOI:	10.1093/bib/bbz071
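
Illustration:	The reverse-mutation check described in the summary can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the article: it assumes a predictor returns a stability change (ΔΔG) for each mutation, and that an unbiased predictor should give a reverse mutation (mutant back to wild type) the same magnitude with the opposite sign. The function name `antisymmetry_report` and the sample values are hypothetical.

```python
from statistics import mean


def antisymmetry_report(forward, reverse):
    """Compare predicted stability changes for forward and reverse mutations.

    forward[i]: predicted ddG for wild type -> mutant (hypothetical input)
    reverse[i]: predicted ddG for the same mutant -> wild type
    Returns (r, bias): the Pearson correlation between the two lists and
    the mean of (forward + reverse) / 2. A perfectly antisymmetric
    predictor gives r = -1 and bias = 0.
    """
    if len(forward) != len(reverse) or len(forward) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")

    mean_f, mean_r = mean(forward), mean(reverse)
    cov = sum((f - mean_f) * (v - mean_r) for f, v in zip(forward, reverse))
    var_f = sum((f - mean_f) ** 2 for f in forward)
    var_r = sum((v - mean_r) ** 2 for v in reverse)
    if var_f == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant predictions")

    r = cov / (var_f * var_r) ** 0.5
    bias = mean((f + v) / 2 for f, v in zip(forward, reverse))
    return r, bias


if __name__ == "__main__":
    # Made-up predictions for three mutations, for illustration only.
    forward_pred = [1.2, -0.4, 2.1]
    reverse_pred = [0.9, 0.6, 1.5]  # a consistent predictor would give about -1.2, 0.4, -2.1
    r, bias = antisymmetry_report(forward_pred, reverse_pred)
    print(f"correlation = {r:.2f}, bias = {bias:.2f}")
```

A correlation far from -1 or a bias far from 0 would indicate that a predictor treats forward and reverse mutations inconsistently, which is the kind of symptom the summary attributes to overfitting.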