Biomarker discovery and visualization in gene expression data with efficient generalized matrix approximations
Published in | Journal of bioinformatics and computational biology Vol. 5; no. 2a; p. 251 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Singapore 01.04.2007 |
Summary: | In most real-world gene expression data sets, there are multiple sample classes with ordinals, which are categorized into normal or diseased types. Traditional feature (attribute) selection methods treat all classes equally and pay no attention to up/down regulation across the normal and diseased types, while methods designed specifically for gene selection consider differential expression between normal and diseased samples but ignore the existence of multiple classes. In this paper, to improve biomarker discovery, we propose to make the best use of both aspects: differential expression (which can be viewed as domain knowledge about gene expression data) and multiple classes (which can be viewed as a characteristic of the data set). We take both aspects into account simultaneously by employing rank-1 generalized matrix approximations (GMA). Our results show that GMA can not only improve the accuracy of sample classification, but also provide a visualization method for effectively analyzing gene expression data on both genes and samples. Based on the mechanism of matrix approximation, we further propose an algorithm, CBiomarker, to discover compact biomarkers by reducing redundancy. |
---|---|
ISSN: | 0219-7200 |
DOI: | 10.1142/S0219720007002746 |