FuseNet: Self-Supervised Dual-Path Network for Medical Image Segmentation
Format | Journal Article |
---|---|
Language | English |
Published | 21.11.2023 |
Summary: Semantic segmentation, a crucial task in computer vision, often relies on labor-intensive and costly annotated datasets for training. In response to this challenge, we introduce FuseNet, a dual-stream framework for self-supervised semantic segmentation that eliminates the need for manual annotation. FuseNet leverages the shared semantic dependencies between the original and augmented images to create a clustering space, effectively assigning pixels to semantically related clusters and ultimately generating the segmentation map. Additionally, FuseNet incorporates a cross-modal fusion technique that extends the principles of CLIP by replacing textual data with augmented images. This approach enables the model to learn complex visual representations, enhancing robustness to image variations analogous to CLIP's invariance to textual phrasing. To further improve edge alignment and spatial consistency between neighboring pixels, we introduce an edge refinement loss. This loss function incorporates edge information to enhance spatial coherence, facilitating the grouping of nearby pixels with similar visual features. Extensive experiments on skin lesion and lung segmentation datasets demonstrate the effectiveness of our method. Codebase: https://github.com/xmindflow/FuseNet
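The CLIP-style fusion described in the summary — contrasting original images against their augmented counterparts instead of against text — suggests a symmetric InfoNCE-style objective. The sketch below is an illustrative assumption, not FuseNet's actual implementation; the function name, embedding shapes, and temperature value are hypothetical.

```python
import numpy as np

def clip_style_loss(z_orig, z_aug, temperature=0.07):
    """Symmetric contrastive loss between embeddings of original images
    (z_orig) and their augmentations (z_aug), both of shape (N, D).
    A sketch of a CLIP-like objective with the text branch replaced
    by an augmented-image branch; all details here are assumptions."""
    # L2-normalise each embedding so dot products are cosine similarities
    z_orig = z_orig / np.linalg.norm(z_orig, axis=1, keepdims=True)
    z_aug = z_aug / np.linalg.norm(z_aug, axis=1, keepdims=True)
    # Pairwise similarity matrix, scaled by temperature
    logits = z_orig @ z_aug.T / temperature
    n = logits.shape[0]
    labels = np.arange(n)  # matching pairs lie on the diagonal

    def xent(l):
        # Numerically stable softmax cross-entropy against the diagonal
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(n), labels].mean()

    # Symmetric: original->augmented and augmented->original directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

As in CLIP, minimising this loss pulls each image embedding toward the embedding of its own augmentation while pushing it away from the other samples in the batch, which is what lets the model learn augmentation-invariant representations.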
DOI: 10.48550/arxiv.2311.13069