Counterfactual contrastive learning: robust representations via causal image synthesis


Bibliographic Details
Published in: arXiv.org
Main Authors: Roschewitz, Melanie; De Sousa Ribeiro, Fabio; Xia, Tian; Khara, Galvin; Glocker, Ben
Format: Paper / Journal Article
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 17.09.2024
Summary: Contrastive pretraining is well-known to improve downstream task performance and model generalisation, especially in limited label settings. However, it is sensitive to the choice of augmentation pipeline. Positive pairs should preserve semantic information while destroying domain-specific information. Standard augmentation pipelines emulate domain-specific changes with pre-defined photometric transformations, but what if we could simulate realistic domain changes instead? In this work, we show how to utilise recent progress in counterfactual image generation to this effect. We propose CF-SimCLR, a counterfactual contrastive learning approach which leverages approximate counterfactual inference for positive pair creation. Comprehensive evaluation across five datasets, on chest radiography and mammography, demonstrates that CF-SimCLR substantially improves robustness to acquisition shift with higher downstream performance on both in- and out-of-distribution data, particularly for domains which are under-represented during training.
ISSN: 2331-8422
DOI: 10.48550/arxiv.2403.09605
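The summary above describes the core idea of CF-SimCLR: forming contrastive positive pairs from a real image and a domain counterfactual of that image, rather than relying only on pre-defined photometric transformations. The following is a minimal sketch in PyTorch of what such a positive-pair scheme could look like, assuming a standard SimCLR NT-Xent loss; the cf_generator interface is a hypothetical stand-in for a pretrained approximate counterfactual-inference model and is not the paper's actual API.

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Standard SimCLR NT-Xent loss over two batches of projections."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))                # exclude self-similarity
    # Positive for sample i is its other view at index i + n (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def cf_simclr_step(encoder, projector, cf_generator, images, target_domains, augment):
    # View 1: a standard stochastic augmentation of the real image.
    view1 = augment(images)
    # View 2: a counterfactual of the same image rendered in another domain
    # (e.g. a different scanner), then augmented as usual. cf_generator is a
    # hypothetical placeholder for a pretrained counterfactual image model,
    # kept frozen here so only the encoder and projector are trained.
    with torch.no_grad():
        cf_images = cf_generator(images, target_domain=target_domains)
    view2 = augment(cf_images)
    z1 = projector(encoder(view1))
    z2 = projector(encoder(view2))
    return nt_xent_loss(z1, z2)

Keeping the counterfactual generator under no_grad reflects the setup implied by the abstract, where a pretrained causal image-generation model supplies realistic domain changes while contrastive training updates only the representation network.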