A Semi-Supervised Learning Method to Remedy the Lack of Labeled Data

Bibliographic Details
Published in: 2021 15th International Conference on Advanced Computing and Applications (ACOMP), pp. 78-84
Main Authors: Nguyen, Nhut-Quang; Le, Thanh-Sach
Format: Conference Proceeding
Language: English
Published: IEEE, 01.11.2021
Summary: Contrastive learning has recently attracted increasing attention across research domains. Prior work has studied the effectiveness of representations learned by contrastive learning on ImageNet, which transfer well to many other vision problems. However, the performance of representations learned by contrastive learning on other datasets has not been analysed deeply or systematically. In this study, we conduct comprehensive experiments to fill this gap and combine the results with an analysis of the distribution distance between datasets. We find a negative correlation between distribution distance and the effectiveness of the learned representation. These observations motivate a new semi-supervised framework, Self-Contrastively Supervised Learning (SelfCSL), which uses data from the same domain as the target problem to build a pre-trained model via contrastive learning. Because it is trained on the problem's own data source, the pre-trained model captures characteristics specific to the problem, thereby increasing efficiency and stability. To test the method, we evaluated it on MedMNIST, a collection of 10 pre-processed open medical datasets. The proposed method achieves higher classification AUC than the ImageNet-initialized model on 5 of the 10 datasets and higher stability on 9 of the 10.
ISSN: 2688-0202
DOI: 10.1109/ACOMP53746.2021.00017