Balanced Functional Module Detection in genomic data

Motivation High-dimensional genomic data can be analyzed to understand the effects of variables on a target variable such as a clinical outcome. For understanding the underlying biological mechanism affecting the target, it is important to discover the complete set of relevant variables. Thus variab...

Full description

Saved in:
Bibliographic Details
Published inBioinformatics Advances Vol. 1; no. 1; p. vbab018
Main Authors Tritchler, David, Towle-Miller, Lorin M, Miecznikowski, Jeffrey C
Format Journal Article
LanguageEnglish
Published England Oxford University Press 2021
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Motivation High-dimensional genomic data can be analyzed to understand the effects of variables on a target variable such as a clinical outcome. For understanding the underlying biological mechanism affecting the target, it is important to discover the complete set of relevant variables. Thus variable selection is a primary goal, which differs from a prediction criterion. Of special interest are functional modules, cooperating sets of variables affecting the target which can be characterized by a graph. In applications such as social networks, the concept of balance in undirected signed graphs characterizes the consistency of associations within the network. This property requires that the module variables have a joint effect on the target outcome with no internal conflict, an efficiency that may be applied to biological networks. Results In this paper, we model genomic variables in signed undirected graphs for applications where the set of predictor variables influences an outcome. Consequences of the balance property are exploited to implement a new module discovery algorithm, balanced Functional Module Detection (bFMD), which selects a subset of variables from high-dimensional data that compose a balanced functional module. Our bFMD algorithm performed favorably in simulations as compared to other module detection methods. Additionally, bFMD detected interpretable results in an application using RNA-seq data obtained from subjects with Uterine Corpus Endometrial Carcinoma using the percentage of tumor invasion as the outcome of interest. The variables selected by bFMD have improved interpretability due to the logical consistency afforded by the balance property. Supplementary information Supplementary data are available at Bioinformatics Advances online.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1367-4803
2635-0041
2635-0041
DOI:10.1093/bioadv/vbab018