FreqSelect: Frequency-Aware fMRI-to-Image Reconstruction
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 18.05.2025 |
DOI | 10.48550/arxiv.2505.12552 |
Summary: | Reconstructing natural images from functional magnetic resonance imaging
(fMRI) data remains a core challenge in neural decoding due to the mismatch
between the richness of visual stimuli and the noisy, low-resolution nature of
fMRI signals. While recent two-stage models, combining deep variational
autoencoders (VAEs) with diffusion models, have advanced this task, they treat
all spatial-frequency components of the input equally. This uniform treatment
forces the model to extract meaningful features and suppress irrelevant noise
simultaneously, limiting its effectiveness. We introduce FreqSelect, a
lightweight, adaptive module that selectively filters spatial-frequency bands
before encoding. By dynamically emphasizing frequencies that are most
predictive of brain activity and suppressing those that are uninformative,
FreqSelect acts as a content-aware gate between image features and neural
data. It integrates seamlessly into standard very deep VAE-diffusion pipelines
and requires no additional supervision. Evaluated on the Natural Scenes
Dataset, FreqSelect consistently improves reconstruction quality across both
low- and high-level metrics. Beyond performance gains, the learned
frequency-selection patterns offer interpretable insights into how different
visual frequencies are represented in the brain. Our method generalizes across
subjects and scenes, and holds promise for extension to other neuroimaging
modalities, offering a principled approach to enhancing both decoding accuracy
and neuroscientific interpretability. |
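The summary describes a gate that reweights spatial-frequency bands of an image before encoding. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes concentric radial bands in the 2D Fourier plane and one sigmoid gate per band. The function names `radial_band_masks` and `gate_frequency_bands`, the band layout, and the hand-set logits are all hypothetical. In the described method the gate values would be learned jointly with the decoder rather than fixed by hand.

```python
import numpy as np


def radial_band_masks(height, width, num_bands):
    """Partition the 2D frequency plane into concentric radial bands.

    Assumption: equal-width bands in radial frequency. The summary does
    not say how bands are defined.
    """
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    edges = np.linspace(0.0, radius.max(), num_bands + 1)
    masks = []
    for i in range(num_bands):
        lower, upper = edges[i], edges[i + 1]
        if i == num_bands - 1:
            # Include the outermost radius in the last band.
            masks.append((radius >= lower) & (radius <= upper))
        else:
            masks.append((radius >= lower) & (radius < upper))
    return masks


def gate_frequency_bands(image, band_logits):
    """Scale each frequency band of a 2D image by sigmoid(logit).

    Assumption: one scalar gate per band. Here the logits are supplied
    by hand; in a trained model they would be learnable parameters.
    """
    logits = np.asarray(band_logits, dtype=float)
    gates = 1.0 / (1.0 + np.exp(-logits))
    masks = radial_band_masks(image.shape[0], image.shape[1], len(gates))
    weight = np.zeros(image.shape, dtype=float)
    for gate, mask in zip(gates, masks):
        weight += gate * mask
    spectrum = np.fft.fft2(image)
    # The weight is symmetric in frequency, so the result is real up to
    # floating-point error.
    return np.real(np.fft.ifft2(spectrum * weight))


# Usage: keep only the lowest of four bands on a random 16x16 image.
rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
low_only = gate_frequency_bands(image, [50.0, -50.0, -50.0, -50.0])
all_open = gate_frequency_bands(image, [50.0, 50.0, 50.0, 50.0])
```

With all gates open the image passes through essentially unchanged, while closing the upper bands leaves a smoothed version that keeps the image mean, since the zero-frequency component sits in the lowest band.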