Removal of batch effects using distribution-matching residual networks

Bibliographic Details
Published in Bioinformatics (Oxford, England) Vol. 33; no. 16; pp. 2539-2546
Main Authors Shaham, Uri; Stanton, Kelly P; Zhao, Jun; Li, Huamin; Raddassi, Khadir; Montgomery, Ruth; Kluger, Yuval
Format Journal Article
Language English
Published England Oxford University Press 15.08.2017
Summary: Sources of variability in experimentally derived data include measurement error in addition to the physical phenomena of interest. This measurement error combines systematic components, which originate from the measuring instrument, with random measurement errors. Several novel biological technologies, such as mass cytometry and single-cell RNA-seq (scRNA-seq), are plagued with systematic errors that may severely affect statistical analysis if the data are not properly calibrated. We propose a novel deep learning approach for removing systematic batch effects. Our method is based on a residual neural network, trained to minimize the Maximum Mean Discrepancy (MMD) between the multivariate distributions of two replicates measured in different batches. We apply our method to mass cytometry and scRNA-seq datasets and demonstrate that it effectively attenuates batch effects. Availability: Our code and data are publicly available at https://github.com/ushaham/BatchEffectRemoval.git. Contact: yuval.kluger@yale.edu. Supplementary data are available at Bioinformatics online.
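The training objective named in the summary, the Maximum Mean Discrepancy between two batches, can be illustrated with a small sample-based estimate. The sketch below is not the authors' implementation: the Gaussian kernel, the fixed bandwidth `sigma`, the biased estimator, and the function names `gaussian_kernel` and `mmd2` are all assumptions made for illustration, and no residual network or training loop is shown. The synthetic "batches" are random draws, not data from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between rows of a and rows of b (hypothetical helper)."""
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * (a @ b.T)
    )
    # Clip tiny negative values caused by floating-point rounding.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y.

    Assumes a single Gaussian kernel with a fixed bandwidth; other
    kernel choices are equally possible.
    """
    k_xx = gaussian_kernel(x, x, sigma)
    k_yy = gaussian_kernel(y, y, sigma)
    k_xy = gaussian_kernel(x, y, sigma)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


# Synthetic stand-ins for two replicates measured in different batches.
rng = np.random.default_rng(0)
batch_a = rng.normal(0.0, 1.0, size=(200, 5))
batch_b = rng.normal(1.0, 1.0, size=(200, 5))      # systematic shift
batch_a_rep = rng.normal(0.0, 1.0, size=(200, 5))  # same distribution as batch_a

print("shifted batch:", mmd2(batch_a, batch_b))
print("matched batch:", mmd2(batch_a, batch_a_rep))
```

In a calibration setting, a quantity of this kind would serve as the loss that a network mapping one batch toward the other tries to drive down; a systematic shift between batches yields a larger estimate than two samples from the same distribution.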
Uri Shaham and Kelly P. Stanton contributed equally.
ISSN: 1367-4803, 1367-4811
DOI: 10.1093/bioinformatics/btx196