Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets

Bibliographic Details
Published in	Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) Vol. 2022; pp. 10422 - 10431
Main Authors	Lokhande, Vishnu Suresh, Chakraborty, Rudrasis, Ravi, Sathya N., Singh, Vikas
Format	Conference Proceeding; Journal Article
Language	English
Published	United States IEEE 01.06.2022
Subjects	Accountability; Atmospheric measurements; Codes; Computer vision; Fairness; Neural networks; Neuroimaging; Particle measurements; Privacy and ethics in vision; Optimization methods; Others; Representation learning; Statistical methods; Transparency
Summary:	Pooling multiple neuroimaging datasets across institutions often improves statistical power when evaluating associations (e.g., between risk factors and disease outcomes) that may otherwise be too weak to detect. When there is only a single source of variability (e.g., different scanners), domain adaptation and matching the distributions of representations may suffice in many scenarios. But when more than one nuisance variable concurrently influences the measurements, pooling datasets poses unique challenges; e.g., variations in the data can come from both the acquisition method and the demographics of participants (gender, age). Invariant representation learning, by itself, is ill-suited to fully model the data generation process. In this paper, we show how bringing together recent results on equivariant representation learning (for studying symmetries in neural networks), instantiated on structured spaces, with a simple use of classical results on causal inference provides an effective practical solution. In particular, we demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples. Our code is available at https://github.com/vsingh-group/DatasetPooling.
ISSN:	1063-6919 2575-7075
DOI:	10.1109/CVPR52688.2022.01018