Improving query efficiency of black-box attacks via the preference of deep learning models

  • Xiangyuan Yang
  • Jie Lin
  • Hanlin Zhang
  • Peng Zhao
  • Xi'an Jiaotong University
  • Qingdao University

Research output: Contribution to journal, article, peer-reviewed

2 citations (Scopus)

Abstract

Black-box query attacks are effective at compromising deep-learning models using only the model's output. These attacks typically face challenges with low attack success rates (ASRs) when limited to fewer than ten queries per example. Recent approaches have improved ASRs due to the transferability of initial perturbations, yet they still suffer from inefficient querying. Our study introduces the Gradient-Aligned Attack (GAA) to enhance ASRs with minimal perturbation by focusing on the model's preference. We define a preference property where the generated adversarial example prefers to be misclassified as the wrong category with a high initial confidence. This property is further elucidated by the gradient preference, suggesting a positive correlation between the magnitude of a coefficient in a partial derivative and the norm of the derivative itself. Utilizing this, we devise the gradient-aligned CE (GACE) loss to precisely estimate gradients by aligning these coefficients between the surrogate and victim models, with coefficients assessed by the victim model's outputs. GAA, based on the GACE loss, also aims to achieve the smallest perturbation. Our tests on ImageNet, CIFAR10, and Imagga API show that GAA can increase ASRs by 25.7% and 40.3% for untargeted and targeted attacks respectively, while only needing minimally disruptive perturbations. Furthermore, the GACE loss reduces the number of necessary queries by up to 2.5x and enhances the transferability of advanced attacks by up to 14.2%, especially when using an ensemble surrogate model. Code is available at https://github.com/HaloMoto/GradientAlignedAttack.
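
The abstract does not give the GACE loss itself, so the following is only a minimal sketch of the standard fact it appears to build on: for softmax cross-entropy, the input gradient decomposes as a sum of per-class logit gradients, each weighted by a coefficient (p_k - y_k) that depends only on the output probabilities. Because a black-box attacker can observe those probabilities, the coefficients can be taken from the victim's output while the logit gradients come from a surrogate. The toy linear surrogate, the weights, the input, and the `victim_probs` values are all hypothetical, and the function names are illustrative; this is not the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ce_coefficients(probs, label):
    """Coefficients (p_k - y_k) that weight each logit gradient in the CE gradient."""
    return [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]

def input_gradient(weights, coeffs):
    """CE gradient w.r.t. the input for a linear model z = W x: sum_k c_k * W[k]."""
    dim = len(weights[0])
    return [sum(c * row[j] for c, row in zip(coeffs, weights)) for j in range(dim)]

# Hypothetical toy surrogate: 3 classes, 2 input features.
surrogate_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x = [0.5, -0.2]
label = 0

surrogate_logits = [sum(w * xi for w, xi in zip(row, x)) for row in surrogate_w]
surrogate_probs = softmax(surrogate_logits)

# Hypothetical stand-in for probabilities returned by one query to the victim.
victim_probs = [0.6, 0.3, 0.1]

# Gradient using the surrogate's own coefficients.
plain_grad = input_gradient(surrogate_w, ce_coefficients(surrogate_probs, label))

# Same surrogate logit gradients, but coefficients taken from the victim's output.
aligned_grad = input_gradient(surrogate_w, ce_coefficients(victim_probs, label))
```

In this sketch the two gradients differ only through the coefficients, which is the quantity the abstract says is aligned between the surrogate and victim models; how the paper performs that alignment inside a loss function is not specified here.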

Original language: English
Article number: 121013
Journal: Information Sciences
Volume: 678
DOI
Publication status: Published - Sep 2024
