Survey on Nucleotide Encoding Techniques and SVM Kernel Design for Human Splice Site Prediction

Bibliographic Details
Published in Interdisciplinary bio central Vol. 4; no. 4; pp. 14.1 - 14.6
Main Authors Bari, A.T.M. Golam, Reaz, Mst. Rokeya, Choi, Ho-Jin, Jeong, Byeong-Soo
Format Journal Article
Language Korean
Published 2012
Summary: Splice site prediction in a DNA sequence is a basic search problem: finding the exon/intron and intron/exon boundaries. Removing the introns and joining the exons together forms the mRNA sequence, which is the input to the translation process, so splicing is a necessary step in the central dogma of molecular biology. The main task of splice site prediction is first to locate the candidate sequences marked by the GT and AG dinucleotides, and then to separate the true splice sites from the false ones among those candidates. In this paper, we survey research on splice site prediction based on the support vector machine (SVM). The surveyed works differ mainly in their nucleotide encoding technique and their choice of SVM kernel. Some methods encode the DNA sequence sparsely, whereas others encode it probabilistically. The encoded sequences serve as input to the SVM, which classifies them using its learned model. Classification accuracy depends largely on selecting a kernel suited to sequence data and on choosing the kernel parameters well. We examine each encoding technique and group the techniques by similarity, then discuss kernels and their parameter selection. The survey provides a basic understanding of encoding approaches and of kernel selection for SVM-based splice site prediction.
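The candidate search and sparse encoding mentioned in the summary can be illustrated with a minimal Python sketch. It is not taken from the surveyed papers: the function names, the `flank` parameter, and the window layout are assumptions made for illustration, and one-hot encoding stands in here for "sparse" encoding in general. The linear kernel at the end only shows what an SVM would see. On one-hot vectors it counts the positions at which two windows carry the same nucleotide.

```python
# Illustrative sketch only; names and window layout are hypothetical.
ALPHABET = "ACGT"


def find_candidates(seq, motif, flank=3):
    """Return (position, window) pairs for every occurrence of `motif`
    that has at least `flank` bases available on both sides."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        lo, hi = start - flank, start + len(motif) + flank
        if lo >= 0 and hi <= len(seq):
            hits.append((start, seq[lo:hi]))
        start = seq.find(motif, start + 1)  # allow overlapping matches
    return hits


def sparse_encode(window):
    """One-hot encode a window: four binary features per nucleotide,
    ordered A, C, G, T. Unknown symbols (e.g. N) become all zeros."""
    vec = []
    for base in window:
        vec.extend(1 if base == letter else 0 for letter in ALPHABET)
    return vec


def linear_kernel(x, y):
    """Dot product of two encoded windows of equal length."""
    return sum(a * b for a, b in zip(x, y))


if __name__ == "__main__":
    donors = find_candidates("CCAGGTAAGTCC", "GT", flank=2)
    for pos, window in donors:
        print(pos, window, sparse_encode(window))
```

Each candidate window would then be labeled true or false and passed to an SVM. The choice of kernel and kernel parameters, which the survey discusses, is left out of this sketch.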
Bibliography:KISTI1.1003/JNL.JAKO201210635653679
ISSN:2005-8543