GUEST: an R package for handling estimation of graphical structure and multiclassification for error-prone gene expression data

Bibliographic Details
Published in	Bioinformatics (Oxford, England) Vol. 40; no. 12
Main Authors	Chen, Li-Pang; Tsao, Hui-Shan
Format	Journal Article
Language	English
Published	England: Oxford University Press, 28.11.2024; Oxford Publishing Limited (England)
Subjects	Algorithms; Applications Note; Availability; Bioinformatics; Computational Biology - methods; Computer graphics; Covariance matrix; Data science; Discriminant Analysis; Error detection; Gene Expression; Gene Expression Profiling - methods; Humans; Machine learning; Random variables; Software; Supervised learning
Summary:	In bioinformatics studies, understanding the network structure of gene expression variables is one of the main interests. Graphical models are widely used in data science to characterize the dependence structure among multivariate random variables. However, gene expression data may suffer from ultrahigh dimensionality and measurement error, which make detecting the network structure challenging. Another important application of gene expression variables is classifying subjects by tumor type or disease. In supervised learning, linear discriminant analysis is a commonly used approach, but the conventional implementation is limited to precisely measured variables and requires computing the inverse of their covariance matrix, known as the precision matrix. To tackle these challenges and provide a reliable estimation procedure for public use, we develop the R package GUEST, which stands for Graphical models for Ultrahigh-dimensional and Error-prone data by the booSTing algorithm. The package aims to handle measurement error effects in high-dimensional variables under various distributions and then applies the boosting algorithm to identify the network structure and estimate the precision matrix. Once the precision matrix is estimated, it can be used to construct the linear discriminant function and improve classification accuracy. Availability and implementation: The R package is available at https://cran.r-project.org/web/packages/GUEST/index.html.
ISSN:	1367-4803, 1367-4811
DOI:	10.1093/bioinformatics/btae731
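
Illustrative note: the summary states that an estimated precision matrix can be used to construct the linear discriminant function. The sketch below shows that general textbook step only, assuming a shared precision matrix Theta across classes and the standard score x' Theta mu_k - 0.5 mu_k' Theta mu_k + log(pi_k). It is not the GUEST package's code or API (GUEST is an R package); the function names `lda_scores` and `classify` and all toy numbers are hypothetical, and the precision matrix is simply supplied by hand rather than estimated with error correction or boosting.

```python
import math


def lda_scores(x, means, priors, precision):
    """Return one linear discriminant score per class.

    Assumes every class shares the same precision matrix (inverse covariance).
    """
    def matvec(matrix, vector):
        return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    scores = []
    for mu, pi in zip(means, priors):
        theta_mu = matvec(precision, mu)  # Theta * mu_k
        score = dot(x, theta_mu) - 0.5 * dot(mu, theta_mu) + math.log(pi)
        scores.append(score)
    return scores


def classify(x, means, priors, precision):
    """Assign x to the class index with the largest discriminant score."""
    scores = lda_scores(x, means, priors, precision)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy inputs: three classes, two variables.
toy_precision = [[2.0, -1.0], [-1.0, 2.0]]
toy_means = [[0.0, 0.0], [2.0, 2.0], [-2.0, 2.0]]
toy_priors = [1 / 3, 1 / 3, 1 / 3]

predicted = classify([1.8, 2.1], toy_means, toy_priors, toy_precision)
```

In this toy setting the point [1.8, 2.1] lies closest to the second class mean, so `predicted` is the index 1. A sparser or better-estimated precision matrix changes only the `precision` argument; the scoring rule stays the same.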