Cross-Subject Reference Attention for Brain Lesion Segmentation

Bibliographic Details
Published in	2025 IEEE 2nd International Conference on Deep Learning and Computer Vision (DLCV) pp. 1 - 5
Main Authors	Jiang, Runze; Ye, Chuyang
Format	Conference Proceeding
Language	English
Published	IEEE 06.06.2025
Subjects	Accuracy; Attention; Brain lesion segmentation; Brain modeling; Computational modeling; Computer vision; Cross-subject reference; Image coding; Image segmentation; Lesions; Navigation; Transformers
DOI	10.1109/DLCV65218.2025.11088861

Summary:	Brain lesion segmentation serves as a useful tool for clinical diagnosis and scientific research. The Transformer architecture has achieved remarkable performance in computer vision and can potentially benefit brain lesion segmentation. Standard Transformer models partition an image into patches and compute the self-attention matrix between these patches to model long-range dependencies. However, as brain lesions usually occupy only a small portion of the image, self-attention is computed mostly between patches of lesions and healthy tissue. Consequently, there is a lack of long-range dependency that is relevant to brain lesion segmentation, and existing Transformer models have not achieved a clear improvement over convolution-based methods such as nnU-Net. To address this limitation, we propose a cross-subject reference attention (CRA) mechanism for brain lesion segmentation. CRA exploits information not only from the image to be segmented but also from reference lesions in other images, which can provide the relevant long-range dependency. Specifically, CRA consists of a subject-level tokenizer, a reference pool, and a hierarchical cross-attention module. The subject-level tokenizer first maps the input feature map of each subject into a fixed number of tokens, which compresses the rich image information and reduces the subsequent computation overhead. Then, the tokens of selected subjects are stored in the reference pool as reference tokens. Finally, in the hierarchical cross-attention module, image features are adjusted with the adaptive guidance of the reference tokens, which alleviates the lack of lesion patches in a single image. CRA is agnostic to the segmentation backbone, and we integrate it with the state-of-the-art nnU-Net framework. To evaluate the proposed method, we performed experiments on multiple datasets, and the results indicate that CRA improves the accuracy of brain lesion segmentation.
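
The three components named in the summary (subject-level tokenizer, reference pool, cross-attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record gives no architectural details, so the attention-pooling tokenizer, the random stand-in for learned queries, the single attention level (the hierarchy is omitted), the residual update, and all function names and sizes are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def tokenize_subject(features, queries):
    """Compress (N, C) patch features into a fixed number K of tokens.

    Assumption: attention pooling with K query vectors, so the token
    count does not depend on the number of patches N.
    """
    scores = queries @ features.T / np.sqrt(features.shape[1])  # (K, N)
    return softmax(scores, axis=-1) @ features                  # (K, C)


def cross_attend(features, ref_tokens):
    """Adjust (N, C) image features using (M, C) reference tokens.

    Assumption: one cross-attention level with a residual connection;
    learned projections and the hierarchical structure are omitted.
    """
    scores = features @ ref_tokens.T / np.sqrt(features.shape[1])  # (N, M)
    return features + softmax(scores, axis=-1) @ ref_tokens        # (N, C)


rng = np.random.default_rng(0)
C, K = 8, 4                        # hypothetical channel and token counts
queries = rng.normal(size=(K, C))  # random stand-in for learned queries

# Reference pool: tokens of three hypothetical reference subjects whose
# feature maps have different numbers of patches.
reference_pool = np.concatenate(
    [tokenize_subject(rng.normal(size=(n, C)), queries) for n in (50, 64, 30)]
)

target = rng.normal(size=(40, C))  # features of the image to segment
adjusted = cross_attend(target, reference_pool)
print(reference_pool.shape, adjusted.shape)  # (12, 8) (40, 8)
```

Because every subject is reduced to K tokens regardless of image size, the cost of the cross-attention step in this sketch scales with the pool size (here 3 × 4 = 12 tokens) rather than with the total number of patches across the reference images.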