
Independent Block-Wise Attribution for Vision Transformer Interpretability through Semantic Relevance

  • Nan Qi
  • Peng Zhao
  • Guiqin Wang
  • Cong Zhao
  • Shusen Yang
  • Xi'an Jiaotong University
  • Guangdong Artificial Intelligence and Digital Economy Laboratory - Guangzhou

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Transformers are becoming the dominant model family in computer vision, which has spurred research into their interpretability. Existing explanation techniques, whether attention-based or gradient-based, provide a dependable way to quantify the impact of input features on model predictions by dissecting the self-attention mechanism. However, current research overlooks block-to-block constraints, which misdirect attribution. In this work, we propose an interpretation method free of block-wise constraints, Independent Block Level Attribution (IBA), which maintains the relative independence of each block in the model. IBA uses class-semantic relevance to reconfigure the model into mutually unaffected class-semantic blocks, each of which performs the attribution computation independently, thus minimizing the influence of inter-block constraints on interpretation performance. Extensive perturbation and segmentation experiments show that our method significantly outperforms current interpretation methods. We also apply IBA to a text transformer to demonstrate that the method generalizes.

Original language: English
Journal: IEEE Transactions on Multimedia
DOI
Publication status: Accepted/In press - 2026
