Biomarker discovery and visualization in gene expression data with efficient generalized matrix approximations

Bibliographic Details
Published in	Journal of bioinformatics and computational biology Vol. 5; no. 2a; p. 251
Main Authors	Li, Wenyuan; Peng, Yanxiong; Huang, Hung-Chung; Liu, Ying
Format	Journal Article
Language	English
Published	Singapore 01.04.2007
Subjects	Algorithms; Biomarkers - metabolism; Computer Graphics; Computer Simulation; Databases, Protein; Gene Expression Profiling - methods; Humans; Models, Biological; Oligonucleotide Array Sequence Analysis - methods; User-Computer Interface
Summary:	Most real-world gene expression data sets contain multiple ordered sample classes, each of which is categorized as either the normal or the diseased type. Traditional feature (attribute) selection methods treat all classes equally and pay no attention to up/down regulation between the normal and diseased types, while gene-specific selection methods consider differential expression between normal and diseased samples but ignore the existence of multiple classes. In this paper, to improve biomarker discovery, we propose to make the best use of both aspects: differential expression (which can be viewed as domain knowledge about gene expression data) and multiple classes (which can be viewed as a characteristic of the data set). We take both aspects into account simultaneously by employing rank-1 generalized matrix approximations (GMA). Our results show that GMA can not only improve the accuracy of sample classification, but also provide a visualization method for effectively analyzing gene expression data on both genes and samples. Based on the mechanism of matrix approximation, we further propose an algorithm, CBiomarker, that discovers compact biomarkers by reducing redundancy.
ISSN:	0219-7200
DOI:	10.1142/S0219720007002746
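
Illustration (not from the paper): the summary does not give the GMA formulation or the CBiomarker algorithm, so the sketch below swaps in an ordinary rank-1 truncated SVD to show the general idea of scoring genes and samples with a single pair of vectors. The function names, the row centering, the `diseased` mask, and the toy matrix are all assumptions made for this example.

```python
import numpy as np


def rank1_scores(X, diseased):
    """Rank-1 SVD approximation of a genes-by-samples matrix.

    NOTE: plain truncated SVD, used as a stand-in; this is NOT the
    paper's generalized matrix approximation (GMA).
    Returns (u, sigma, v, Xc) where Xc is the row-centered matrix and
    sigma * outer(u, v) is its best rank-1 approximation.
    """
    X = np.asarray(X, dtype=float)
    diseased = np.asarray(diseased, dtype=bool)
    # Center each gene across samples so the leading component
    # captures contrast between samples rather than the mean level.
    Xc = X - X.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    u, sigma, v = U[:, 0], s[0], Vt[0, :]
    # Singular vectors are defined only up to a joint sign flip.
    # Orient them so diseased samples score higher than normal ones;
    # a positive u[i] then means gene i is higher in diseased samples.
    if v[diseased].mean() < v[~diseased].mean():
        u, v = -u, -v
    return u, sigma, v, Xc


def top_genes(u, k):
    """Hypothetical helper: indices of the k genes with largest |u|."""
    return np.argsort(-np.abs(u))[:k]


# Made-up toy data: 5 genes (rows) x 4 samples (2 normal, 2 diseased).
X = np.array([
    [1.0, 1.0, 1.0, 1.0],   # flat
    [0.0, 0.0, 4.0, 4.0],   # up in diseased
    [2.0, 2.1, 2.0, 2.1],   # small variation unrelated to class
    [5.0, 5.0, 1.0, 1.0],   # down in diseased
    [0.5, 0.5, 0.6, 0.6],   # very weak up-regulation
])
diseased = np.array([False, False, True, True])

u, sigma, v, Xc = rank1_scores(X, diseased)
top = top_genes(u, 2)
residual = np.linalg.norm(Xc - sigma * np.outer(u, v))

print("top genes:", sorted(top.tolist()))
print("gene scores:", np.round(u, 3))
print("sample scores:", np.round(v, 3))
```

Plotting the entries of `u` against gene index and the entries of `v` against sample index gives a simple two-sided view of the data, which is the kind of gene-and-sample visualization the summary alludes to; how the paper itself builds that view is not stated here.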