Meta-Analyzing Multiple Omics Data With Robust Variable Selection

Bibliographic Details
Published in: Frontiers in Genetics, Vol. 12; p. 656826
Main Authors: Hu, Zongliang; Zhou, Yan; Tong, Tiejun
Format: Journal Article
Language: English
Published: Switzerland, Frontiers Media S.A., 05.07.2021
Summary: High-throughput omics data are increasingly common across many areas of science. Because many publicly available datasets address the same questions, researchers have applied meta-analysis to synthesize multiple datasets and obtain more reliable model estimation and prediction. Given the high dimensionality of omics data, it is also desirable to incorporate variable selection into meta-analysis. Existing variable selection methods for meta-analysis are often sensitive to outliers and may miss relevant covariates, especially with lasso-type penalties. In this paper, we develop a robust variable selection algorithm for meta-analyzing high-dimensional datasets based on logistic regression. We first search for an outlier-free subset of each dataset by borrowing information across the datasets, repeatedly applying least trimmed squares estimates for the logistic model together with a hierarchical bi-level variable selection technique. After obtaining a reliable non-outlier subset, we then refine a reweighting step to further improve efficiency. Simulation studies and real data analysis show that our new method can provide more reliable results than existing meta-analysis methods in the presence of outliers.
Reviewed by: Cen Wu, Kansas State University, United States; Duo Jiang, Oregon State University, United States
Edited by: Jiebiao Wang, University of Pittsburgh, United States
This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics
ISSN: 1664-8021
DOI: 10.3389/fgene.2021.656826