More Synergy, Less Redundancy: Exploiting Joint Mutual Information for Self-Supervised Learning

Bibliographic Details
Published in: 2023 IEEE International Conference on Image Processing (ICIP), pp. 1390-1394
Main Authors: Mohamadi, Salman; Doretto, Gianfranco; Adjeroh, Donald A.
Format: Conference Proceeding
Language: English
Published: IEEE, 08.10.2023

Summary: Self-supervised learning (SSL) is now a serious competitor to supervised learning, even though it does not require data annotation. Several baselines have attempted to make SSL models exploit information about the data distribution and become less dependent on the augmentation effect. However, there is no clear consensus on whether maximizing or minimizing the mutual information between representations of augmentation views practically contributes to improvement or degradation in the performance of SSL models. In this paper, we investigate the role of mutual information in SSL and reformulate the SSL problem from a new perspective on mutual information. To this end, we consider joint mutual information through the lens of partial information decomposition (PID), a key step toward reliable multivariate information measurement. PID enables us to decompose joint mutual information into three kinds of components, namely, unique information, redundant information, and synergistic information. Our framework aims to minimize the redundant information between the views and the desired target representation while simultaneously maximizing the synergistic information. Our experiments lead to a re-calibration of two redundancy-reduction baselines and a proposal for a new SSL training protocol. Experimental results on multiple datasets and two downstream tasks show the effectiveness of this framework.
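
For reference, in the standard two-source partial information decomposition of Williams and Beer (which the abstract invokes; the paper's exact estimator may differ), the joint mutual information between a target representation T and two augmentation views X_1, X_2 splits into unique, redundant, and synergistic terms. A minimal sketch of that decomposition, with notation assumed here:

\[
I(T; X_1, X_2) \;=\; \underbrace{U(T; X_1 \setminus X_2) + U(T; X_2 \setminus X_1)}_{\text{unique}} \;+\; \underbrace{R(T; X_1, X_2)}_{\text{redundant}} \;+\; \underbrace{S(T; X_1, X_2)}_{\text{synergistic}}
\]

Read this way, the objective described in the abstract corresponds to driving the redundant term R(T; X_1, X_2) down while driving the synergistic term S(T; X_1, X_2) up.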
DOI:10.1109/ICIP49359.2023.10222547