Sound reconstruction from human brain activity via a generative model with brain-like auditory features
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 20.06.2023 |
Summary: | The successful reconstruction of perceptual experiences from human brain
activity has provided insights into the neural representations of sensory
experiences. However, reconstructing arbitrary sounds has been avoided due to
the complexity of temporal sequences in sounds and the limited resolution of
neuroimaging modalities. To overcome these challenges, leveraging the
hierarchical nature of brain auditory processing could provide a path toward
reconstructing arbitrary sounds. Previous studies have indicated a hierarchical
homology between the human auditory system and deep neural network (DNN)
models. Furthermore, advancements in audio-generative models now make it
possible to transform compressed representations back into high-resolution
sounds. In this
study, we introduce a novel sound reconstruction method that combines brain
decoding of auditory features with an audio-generative model. Using fMRI
responses to natural sounds, we found that the hierarchical sound features of a
DNN model could be better decoded than spectrotemporal features. We then
reconstructed the sound using an audio transformer that disentangled compressed
temporal information in the decoded DNN features. Our method achieves
unconstrained sound reconstruction that captures the perceptual content and
quality of sounds, and it generalizes to sound categories not included in the
training dataset. Reconstructions from different auditory regions remained
similar to actual sounds, highlighting the distributed nature of auditory
representations. To see whether the reconstructions mirrored actual subjective
perceptual experiences, we performed an experiment involving selective
auditory attention to one of the overlapping sounds. The reconstructions
tended to resemble the attended sound more than the unattended one. These
findings demonstrate that our
proposed model provides a means to externalize experienced auditory contents
from human brain activity. |
DOI: | 10.48550/arxiv.2306.11629 |
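
As a reading aid, the sketch below mirrors the two-stage pipeline the summary
describes: hierarchical DNN sound features are first decoded from fMRI
activity, and an audio-generative model then converts the decoded features
back into a waveform. This is a minimal illustration under stated assumptions,
not the authors' implementation: the ridge-regression decoder, the layer names
(`conv1`, `conv4`, `fc`), and all arrays are hypothetical stand-ins, and the
generative stage is left as a placeholder comment.

```python
# Minimal sketch of the two-stage pipeline summarized above, assuming a
# linear (ridge) decoder per DNN layer. All data are synthetic stand-ins;
# the paper's actual decoder, DNN, and generator are not specified in
# this record.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_train, n_test, n_voxels = 200, 20, 1000          # fMRI trials x voxels
layers = {"conv1": 64, "conv4": 128, "fc": 256}    # hypothetical layer sizes

# Synthetic stand-ins for fMRI responses and per-layer DNN features.
X_train = rng.standard_normal((n_train, n_voxels))
X_test = rng.standard_normal((n_test, n_voxels))
Y_train = {name: rng.standard_normal((n_train, dim))
           for name, dim in layers.items()}

# Stage 1: decode each hierarchical DNN layer with its own ridge model,
# mapping voxel patterns to that layer's feature vector.
decoded = {}
for name, Y in Y_train.items():
    scaler = StandardScaler().fit(X_train)
    model = Ridge(alpha=100.0).fit(scaler.transform(X_train), Y)
    decoded[name] = model.predict(scaler.transform(X_test))

# Stage 2 (placeholder): an audio-generative model, e.g. an audio
# transformer, would map the decoded feature stack back to a waveform:
#   waveform = generator.decode(decoded)   # hypothetical call
```

A per-layer linear decoder is a common choice in feature-decoding studies
because it tests how explicitly each representation is encoded in the fMRI
signal; any regularized regressor could fill the same role in this sketch.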