TY - JOUR
T1 - NeuralDiffuser
T2 - Neuroscience-Inspired Diffusion Guidance for fMRI Visual Reconstruction
AU - Li, Haoyu
AU - Wu, Hao
AU - Chen, Badong
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Reconstructing visual stimuli from functional Magnetic Resonance Imaging (fMRI) enables fine-grained retrieval of brain activity. However, accurately reconstructing diverse details, including structure, background, texture, and color, remains challenging. Stable diffusion models inevitably introduce variability into reconstructed images, even under identical conditions. To address this challenge, we first examine diffusion methods from a neuroscientific perspective: they primarily perform top-down creation using knowledge pre-trained on extensive image datasets, but tend to lack detail-driven bottom-up perception, leading to a loss of faithful details. In this paper, we propose NeuralDiffuser, which incorporates primary visual feature guidance to provide detailed cues in the form of gradients. This extension of the bottom-up process in diffusion models achieves both semantic coherence and detail fidelity when reconstructing visual stimuli. Furthermore, we develop a novel guidance strategy for reconstruction tasks that ensures the consistency of repeated outputs with the original images rather than among varied outputs. Extensive experiments on the Natural Scenes Dataset (NSD) qualitatively and quantitatively demonstrate the advancement of NeuralDiffuser through comparisons with baseline and state-of-the-art methods, as well as ablation studies. Code is available at https://github.com/HaoyyLi/NeuralDiffuser.
AB - Reconstructing visual stimuli from functional Magnetic Resonance Imaging (fMRI) enables fine-grained retrieval of brain activity. However, accurately reconstructing diverse details, including structure, background, texture, and color, remains challenging. Stable diffusion models inevitably introduce variability into reconstructed images, even under identical conditions. To address this challenge, we first examine diffusion methods from a neuroscientific perspective: they primarily perform top-down creation using knowledge pre-trained on extensive image datasets, but tend to lack detail-driven bottom-up perception, leading to a loss of faithful details. In this paper, we propose NeuralDiffuser, which incorporates primary visual feature guidance to provide detailed cues in the form of gradients. This extension of the bottom-up process in diffusion models achieves both semantic coherence and detail fidelity when reconstructing visual stimuli. Furthermore, we develop a novel guidance strategy for reconstruction tasks that ensures the consistency of repeated outputs with the original images rather than among varied outputs. Extensive experiments on the Natural Scenes Dataset (NSD) qualitatively and quantitatively demonstrate the advancement of NeuralDiffuser through comparisons with baseline and state-of-the-art methods, as well as ablation studies. Code is available at https://github.com/HaoyyLi/NeuralDiffuser.
KW - Stable diffusion
KW - fMRI visual reconstruction
KW - guided diffusion
KW - natural scenes dataset (NSD)
KW - neuroimaging
UR - https://www.scopus.com/pages/publications/85215011535
U2 - 10.1109/TIP.2025.3526051
DO - 10.1109/TIP.2025.3526051
M3 - Article
AN - SCOPUS:85215011535
SN - 1057-7149
VL - 34
SP - 552
EP - 565
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -