SIFLoc: a self-supervised pre-training method for enhancing the recognition of protein subcellular localization in immunofluorescence microscopic images

Abstract With the rapid growth of high-resolution microscopy imaging data, revealing the subcellular map of human proteins has become a central task in the spatial proteome. The cell atlas of the Human Protein Atlas (HPA) provides precious resources for recognizing subcellular localization patterns...

Full description

Saved in:
Bibliographic Details
Published inBriefings in bioinformatics Vol. 23; no. 2
Main Authors Tu, Yanlun, Lei, Houchao, Shen, Hong-Bin, Yang, Yang
Format Journal Article
LanguageEnglish
Published England Oxford University Press 10.03.2022
Oxford Publishing Limited (England)
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Abstract With the rapid growth of high-resolution microscopy imaging data, revealing the subcellular map of human proteins has become a central task in the spatial proteome. The cell atlas of the Human Protein Atlas (HPA) provides precious resources for recognizing subcellular localization patterns at the cell level, and the large-scale annotated data enable learning via advanced deep neural networks. However, the existing predictors still suffer from the imbalanced class distribution and the lack of labeled data for minor classes. Thus, it is necessary to develop new methods for coping with these issues. We leverage the self-supervised learning protocol to address these problems. Especially, we propose a pre-training scheme to enhance the conventional supervised learning framework called SIFLoc. The pre-training is featured by a hybrid data augmentation method and a modified contrastive loss function, aiming to learn good feature representations from microscopic images. The experiments are performed on a large-scale immunofluorescence microscopic image dataset collected from the HPA database. Using the same deep neural networks as the classifier, the model pre-trained via SIFLoc not only outperforms the model without pre-training by a large margin but also shows advantages over the state-of-the-art self-supervised learning methods. Especially, SIFLoc improves the prediction accuracy for minor organelles significantly.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:1467-5463
1477-4054
DOI:10.1093/bib/bbab605