Prediction of MicroRNA Precursors Using Parsimonious Feature Sets

Bibliographic Details
Published in Cancer Informatics Vol. 2014; no. Suppl. 1; pp. 95-102
Main Authors Stepanowsky, Petra; Levy, Eric; Kim, Jihoon; Jiang, Xiaoqian; Ohno-Machado, Lucila
Format Journal Article; Book Review
Language English
Published London, England SAGE Publishing 01.01.2014
SAGE Publications
Sage Publications Ltd. (UK)
Libertas Academica
Summary: MicroRNAs (miRNAs) are a class of short noncoding RNAs that regulate gene expression through base pairing with messenger RNAs. Due to the interest in studying miRNA dysregulation in disease and limits of validated miRNA references, identification of novel miRNAs is a critical task. The performance of different models to predict novel miRNAs varies with the features chosen as predictors. However, no study has systematically compared published feature sets. We constructed a comprehensive feature set using the minimum free energy of the secondary structure of precursor miRNAs, a set of nucleotide-structure triplets, and additional extracted sequence and structure characteristics. We then compared the predictive value of our comprehensive feature set to those from three previously published studies, using logistic regression and random forest classifiers. We found that classifiers containing as few as seven highly predictive features are able to predict novel precursor miRNAs as well as classifiers that use larger feature sets. In a real data set, our method correctly identified the holdout miRNAs relevant to renal cancer.
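
The nucleotide-structure triplets mentioned in the summary can be illustrated with a minimal sketch. It assumes one common formulation, in which each interior position contributes its nucleotide plus the paired/unpaired state of itself and its two neighbours (4 nucleotides x 8 state patterns = 32 features). The record does not give the authors' exact definition, so the function name `triplet_features`, the normalization, and the handling of ambiguous bases are illustrative assumptions. The dot-bracket structure is taken as an input and would come from a folding tool.

```python
from itertools import product

NUCLEOTIDES = "ACGU"


def triplet_features(sequence, structure):
    """Return normalized frequencies of 32 nucleotide-structure triplets.

    Hypothetical helper: `structure` is a dot-bracket string of the same
    length as `sequence`, e.g. as produced by an RNA folding tool.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    seq = sequence.upper().replace("T", "U")
    # Collapse '(' and ')' into one "paired" symbol; anything else is
    # treated as unpaired ('.').
    states = "".join("(" if ch in "()" else "." for ch in structure)

    # All 32 keys: middle nucleotide followed by three paired/unpaired states.
    keys = [
        nuc + "".join(trip)
        for nuc in NUCLEOTIDES
        for trip in product("(.", repeat=3)
    ]
    counts = dict.fromkeys(keys, 0)

    total = 0
    for i in range(1, len(seq) - 1):
        key = seq[i] + states[i - 1:i + 2]
        if key in counts:  # skips ambiguous bases such as N
            counts[key] += 1
            total += 1

    return {k: (v / total if total else 0.0) for k, v in counts.items()}


if __name__ == "__main__":
    # Toy hairpin, not a real precursor miRNA.
    feats = triplet_features("GGGAAACCC", "(((...)))")
    for name, value in feats.items():
        if value > 0:
            print(name, round(value, 3))
```

A vector like this, combined with a minimum free energy value and other sequence and structure characteristics, is the kind of input a logistic regression or random forest classifier could then be trained on.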
ISSN: 1176-9351
DOI: 10.4137/CIN.S13877