Random forest for gene selection and microarray data classification

Bibliographic Details
Published in	Bioinformation, Vol. 7, no. 3, pp. 142-146
Main Authors	Moorthy, Kohbalan; Mohamad, Mohd Saberi
Format	Journal Article
Language	English
Published	Singapore: Biomedical Informatics, 01.01.2011
Subjects	Hypothesis; gene expression data; gene selection; Random forest; cancer classification; classification; microarray data
Summary:	A random forest method is used to perform both gene selection and classification of microarray data. In this embedded method, selecting the smallest possible set of genes with the lowest error rate is the key factor in achieving the highest classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes as well as the biggest subset of genes prior to classification. The option to select the biggest subset is provided to assist researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed better at selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag error rates. Furthermore, classification on the selected subset of genes using random forest led to lower prediction error rates than the existing method and other similar available methods.
ISSN:	0973-8894, 0973-2063
DOI:	10.6026/97320630007142
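
The Summary above describes choosing either the smallest or the biggest subset of genes among those with the lowest out-of-bag (OOB) error. The following is a minimal sketch of that final selection step only, not the article's implementation: it assumes a gene elimination procedure has already produced a list of (subset size, OOB error) pairs. The function name `pick_subsets`, the tolerance parameter `tol`, and the example numbers are all hypothetical and are not taken from the article.

```python
from typing import List, Tuple

# One (subset_size, oob_error) pair per candidate gene subset.
# Assumption: these come from some prior random forest elimination run.
Trajectory = List[Tuple[int, float]]


def pick_subsets(trajectory: Trajectory, tol: float = 0.0) -> Tuple[int, int]:
    """Return (smallest_size, biggest_size) among the candidate subsets
    whose OOB error is within `tol` of the lowest error observed."""
    if not trajectory:
        raise ValueError("trajectory must not be empty")
    best = min(err for _, err in trajectory)
    eligible = [size for size, err in trajectory if err <= best + tol]
    return min(eligible), max(eligible)


# Illustrative values only (hypothetical, not results from the article).
example = [(2000, 0.12), (1600, 0.10), (1280, 0.08),
           (40, 0.08), (8, 0.08), (2, 0.15)]
smallest, biggest = pick_subsets(example)
print(smallest, biggest)
```

With these made-up values the call returns sizes 8 and 1280: both reach the lowest OOB error of 0.08, so a researcher could take the 8-gene subset for a compact classifier or the 1280-gene subset as a broader pool of informative genes. A nonzero `tol` would relax the rule to accept subsets with slightly higher error.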