A Method for Identifying Vesicle Transport Proteins Based on LibSVM and MRMD

Bibliographic Details
Published in: Computational and Mathematical Methods in Medicine, Vol. 2020, no. 2020, pp. 1-9
Main Authors: Zhao, Yuming; Teng, Zhixia; Li, Yanjuan; Tao, Zhiyu
Format: Journal Article
Language: English
Published: Cairo, Egypt: Hindawi Publishing Corporation, 2020
ISSN: 1748-670X, 1748-6718
DOI: 10.1155/2020/8926750

Summary: With the development of computer technology, many machine learning algorithms have been applied to biology, forming the discipline of bioinformatics. Protein function prediction is a classic research topic in this area. Although many scholars have made progress in identifying proteins with different algorithms, they often extract a large number of feature types and use very complex classification methods to obtain little improvement in classification performance, and the process is very time-consuming. In this research, we attempt to use as few features as possible to classify vesicular transport proteins while still obtaining a comparatively satisfactory classification result. We adopt CTDC, a submethod of the composition, transition, and distribution (CTD) method, to extract only 39 features from each sequence, and use LibSVM as the classifier. We use the SMOTE method to deal with dataset imbalance. There are 11619 protein sequences in our dataset. We selected 4428 sequences to train our classification model and another 1832 sequences from our dataset to test it, achieving an accuracy of 71.77%. After dimension reduction by MRMD, the accuracy is 72.16%.
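
As a rough illustration of the composition (CTDC) idea mentioned in the summary, the sketch below computes the fraction of residues in each of three classes for a single physicochemical property. The hydrophobicity grouping shown is the one commonly used in the classic CTD scheme. Which property groupings the authors actually used is not stated in the summary, so treat it as an assumption. The function name `composition` and the sample sequence are hypothetical. Repeating the same calculation over 13 three-class properties would give 13 x 3 = 39 values, matching the feature count quoted above.

```python
# Assumption: classic CTD hydrophobicity grouping (three classes).
# The summary does not list the groupings used in the paper.
HYDROPHOBICITY_GROUPS = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}


def composition(sequence, groups):
    """Return the fraction of residues that fall in each group."""
    sequence = sequence.upper()
    length = len(sequence)
    if length == 0:
        raise ValueError("sequence must not be empty")
    return {
        name: sum(sequence.count(aa) for aa in members) / length
        for name, members in groups.items()
    }


# Hypothetical toy sequence, not taken from the paper's dataset.
features = composition("MKVLAAGRE", HYDROPHOBICITY_GROUPS)
```

The resulting dictionary holds three composition values for this one property. A full feature vector would concatenate the values for every property before passing them to a classifier such as LibSVM.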
Academic Editor: Hui Ding