MultiDataSet: an R package for encapsulating multiple data sets with application to omic data integration

Bibliographic Details
Published in: BMC Bioinformatics Vol. 18; no. 1; p. 36
Main Authors: Hernandez-Ferrer, Carles; Ruiz-Arenas, Carlos; Beltran-Gomila, Alba; González, Juan R.
Format: Journal Article
Language: English
Published: London, BioMed Central, 17.01.2017
Publisher: BioMed Central Ltd
ISSN: 1471-2105
DOI: 10.1186/s12859-016-1455-1

Summary:
Background: Reduction in the cost of genomic assays has generated large amounts of biomedical data. As a result, current studies perform multiple experiments on the same subjects. While Bioconductor's methods and classes, implemented in different packages, manage individual experiments, there is no standard class to properly manage different omic datasets from the same subjects. In addition, most R/Bioconductor packages designed to integrate and visualize biological data use basic data structures that lack clear general methods for operations such as subsetting features or selecting samples.
Results: To cover this need, we have developed MultiDataSet, a new R class based on Bioconductor standards and designed to encapsulate multiple data sets. MultiDataSet deals with the usual difficulties of managing multiple and non-complete data sets while offering a simple and general way of subsetting features and selecting samples. We illustrate the use of MultiDataSet in three common situations: 1) performing integration analysis with third-party packages; 2) creating new methods and functions for omic data integration; 3) encapsulating new, unimplemented data from any biological experiment.
Conclusions: MultiDataSet is a suitable class for data integration under the R and Bioconductor framework.