Prediction of protein kinase-specific phosphorylation sites using Random forest algorithm
Published in | 2012 5th International Conference on Biomedical Engineering and Informatics pp. 986 - 989 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.10.2012 |
ISBN | 9781467311830 1467311839 |
DOI | 10.1109/BMEI.2012.6513035 |
Summary: | Reversible phosphorylation is an important mechanism for controlling protein activity in cellular regulatory processes. Experimental identification of kinase-specific phosphorylation sites in substrates is costly in both time and labor, so rapid and effective machine learning methods for prediction are desirable. In this paper, we adopted the random forest (RF) algorithm for prediction of phosphorylation sites. Comparison with Bayesian Decision Theory (BDT) and Support Vector Machine (SVM) on four kinase/kinase family datasets showed that RF performed consistently better. For example, on MAPK data the RF algorithm achieved an AUC of 0.97, which was 0.04 and 0.03 higher than those of BDT and SVM, respectively. In addition, at a fixed high specificity of 99%, the sensitivity of the RF algorithm reached 66%, which was 25% and 23% higher than that of BDT and SVM, respectively. These results show that RF is a powerful machine learning algorithm for protein phosphorylation site prediction. |
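
The summary reports two evaluation measures: AUC, and sensitivity at a fixed high specificity. As a reading aid, the sketch below shows one common way these quantities can be computed from classifier scores. It is not the authors' code; the function names `auc` and `sensitivity_at_specificity` are hypothetical, and the labels and scores are made-up toy values, not data from the paper.

```python
# Illustrative sketch only: toy labels and scores, not data from the paper.

def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(labels, scores, min_specificity):
    """Best sensitivity over all thresholds with specificity >= min_specificity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    # A residue is predicted as a site when its score >= threshold.
    # float("inf") covers the "predict nothing" threshold.
    for t in sorted(set(scores)) + [float("inf")]:
        specificity = sum(n < t for n in neg) / len(neg)
        if specificity >= min_specificity:
            sensitivity = sum(p >= t for p in pos) / len(pos)
            best = max(best, sensitivity)
    return best


# Toy example: 1 = phosphorylation site, 0 = non-site (assumed encoding).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]

toy_auc = auc(labels, scores)
toy_sens = sensitivity_at_specificity(labels, scores, 0.80)
print(f"AUC = {toy_auc:.3f}, sensitivity at >=80% specificity = {toy_sens:.2f}")
```

With real predictions, `min_specificity` would be set to 0.99 to match the operating point quoted in the summary; the toy set is too small for that threshold to be informative.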