Estimation of large block structured covariance matrices: Application to ‘multi‐omic’ approaches to study seed quality

Bibliographic Details
Published in Journal of the Royal Statistical Society Series C: Applied Statistics Vol. 71; no. 1; pp. 119-147
Main Authors Perrot‐Dockès, M., Lévy‐Leduc, C., Rajjou, L.
Format Journal Article
Language English
Published Oxford Oxford University Press 01.01.2022
Wiley
Summary: Motivated by an application in high‐throughput genomics and metabolomics, we propose a novel and fully data‐driven approach for estimating large block structured sparse covariance matrices when the number of variables is much larger than the number of samples, without limiting ourselves to block diagonal matrices. Our approach consists of approximating such a covariance matrix by the sum of a low‐rank sparse matrix and a diagonal matrix. Our methodology can also handle matrices whose block structure appears only after the rows and columns are permuted according to an unknown permutation. Our technique is implemented in the R package BlockCov, which is available from the Comprehensive R Archive Network (CRAN) and from GitHub. To illustrate the statistical and numerical performance of our package, we provide numerical experiments and a thorough comparison with alternative methods. Finally, our approach is applied to the use of ‘multi‐omic’ approaches for studying seed quality.
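The "low‐rank sparse matrix plus diagonal matrix" structure mentioned in the summary can be pictured with a small toy example. The sketch below is only an illustration of that matrix form, written in plain Python; it is not the BlockCov estimator and does not use the package's API. The function name `low_rank_plus_diagonal`, the matrix `Z` and all numeric values are hypothetical, and the diagonal is assumed to be set to 1 (a correlation-like matrix).

```python
# Illustrative only: builds a small matrix of the form Z Z^T + D, where Z is
# sparse with few columns (low rank) and D is diagonal.
# Not the BlockCov estimator; all names and values here are hypothetical.

def low_rank_plus_diagonal(z):
    """Return Z Z^T with its diagonal reset to 1.

    This equals Z Z^T + D with D = diag(1 - (Z Z^T)_ii).
    """
    q = len(z)
    sigma = [
        [sum(a * b for a, b in zip(z[i], z[j])) for j in range(q)]
        for i in range(q)
    ]
    for i in range(q):
        sigma[i][i] = 1.0  # D fills the gap between (Z Z^T)_ii and 1
    return sigma


# Z has q = 5 rows (variables) and r = 2 columns (rank).
# Rows 0-2 load only on column 0; rows 3-4 load on both columns, so the
# block linking variables 0-2 to variables 3-4 is non-zero: the result is
# block structured without being block diagonal.
Z = [
    [0.8, 0.0],
    [0.8, 0.0],
    [0.8, 0.0],
    [0.2, 0.5],
    [0.2, 0.5],
]

SIGMA = low_rank_plus_diagonal(Z)
for row in SIGMA:
    print(" ".join("%.2f" % v for v in row))
```

Printing `SIGMA` shows constant values within each block. When the rows and columns are shuffled, the same values are scattered across the matrix, which is the situation the summary describes as a block structure that appears only after an unknown permutation.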
ISSN: 0035-9254, 1467-9876
DOI: 10.1111/rssc.12524