Prediction of protein kinase-specific phosphorylation sites using Random forest algorithm

Bibliographic Details
Published in: 2012 5th International Conference on Biomedical Engineering and Informatics, pp. 986-989
Main Authors: Fan, Wenwen; Zou, Liang; Li, Ao; Wang, Minghui
Format: Conference Proceeding
Language: English
Published: IEEE, 01.10.2012
ISBN: 9781467311830, 1467311839
DOI: 10.1109/BMEI.2012.6513035

Summary: Reversible phosphorylation is an important mechanism for controlling protein activity in cellular regulatory processes. Experimental identification of kinase-specific phosphorylation sites in substrates is costly in both time and labor, so rapid and effective machine learning methods for prediction are desirable. In this paper, we adopted the Random forest (RF) algorithm for prediction of phosphorylation sites. Comparison with Bayesian Decision Theory (BDT) and Support Vector Machine (SVM) on four kinase/kinase family datasets showed that RF performed consistently better. For example, on the MAPK data the RF algorithm achieved an AUC of 0.97, which was 0.04 and 0.03 higher than those of BDT and SVM, respectively. In addition, at a high specificity of 99%, the sensitivity of the RF algorithm reached 66%, which was 25% and 23% higher than those of BDT and SVM, respectively. These results show that RF is a powerful machine learning algorithm for protein phosphorylation site prediction.
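The summary reports two metrics: AUC, and sensitivity at a fixed specificity. This record does not say how the authors computed them, so the following is only a minimal sketch of the usual definitions. The function names `auc` and `sensitivity_at_specificity` are hypothetical, and the scores are toy values, not data from the paper.

```python
import math


def auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive site scores higher.
    Ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_at_specificity(scores_pos, scores_neg, specificity):
    """Sensitivity at the lowest threshold that keeps at least the requested
    fraction of negatives at or below it. A site is called positive when
    its score is strictly greater than the threshold."""
    neg_sorted = sorted(scores_neg)
    # Number of negatives that must be rejected; the small epsilon guards
    # against floating-point noise in the product.
    k = math.ceil(specificity * len(neg_sorted) - 1e-9)
    threshold = neg_sorted[k - 1] if k >= 1 else float("-inf")
    true_positives = sum(1 for p in scores_pos if p > threshold)
    return true_positives / len(scores_pos)


# Toy classifier scores (illustrative only, not from the paper).
pos = [0.9, 0.8, 0.4, 0.35]   # known phosphorylation sites
neg = [0.7, 0.3, 0.2, 0.1]    # non-phosphorylated sites

print(auc(pos, neg))
print(sensitivity_at_specificity(pos, neg, 1.0))
```

Fixing the specificity before comparing sensitivities, as the summary does at 99%, puts all classifiers at the same false-positive rate, so the sensitivity gap reflects the classifiers rather than the choice of threshold.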