IS ADVERSARIAL TRAINING REALLY A SILVER BULLET FOR MITIGATING DATA POISONING?

  • Rui Wen
  • Zhengyu Zhao
  • Zhuoran Liu
  • Michael Backes
  • Tianhao Wang
  • Yang Zhang
  • Helmholtz Center for Information Security
  • Radboud University Nijmegen
  • University of Virginia

Research output: Contribution to conference › Paper › Peer-reviewed

20 Citations (Scopus)

Abstract

Indiscriminate data poisoning can decrease the clean test accuracy of a deep learning model by slightly perturbing its training samples. There is a consensus that such poisons can hardly harm adversarially-trained (AT) models when the adversarial training budget is no less than the poison budget, i.e., ϵ_adv ≥ ϵ_poi. This consensus, however, is challenged in this paper based on our new attack strategy that induces entangled features (EntF). The existence of entangled features makes the poisoned data less useful for training a model, whether or not AT is applied. We demonstrate that for attacking a CIFAR-10 AT model under a reasonable setting with ϵ_adv = ϵ_poi = 8/255, our EntF yields an accuracy drop of 13.31%, which is 7× better than existing methods and equal to discarding 83% of the training data. We further show the generalizability of EntF to more challenging settings, e.g., higher AT budgets, partial poisoning, unseen model architectures, and stronger (ensemble or adaptive) defenses. We finally provide new insights into the distinct roles of non-robust vs. robust features in poisoning standard vs. AT models and demonstrate the possibility of using a hybrid attack to poison standard and AT models simultaneously. Our code is available at https://github.com/WenRuiUSTC/EntF.

Original language: English
Publication status: Published - 2023
Externally published: Yes
Event: 11th International Conference on Learning Representations, ICLR 2023 - Kigali, Rwanda
Duration: 1 May 2023 → 5 May 2023

Conference

Conference: 11th International Conference on Learning Representations, ICLR 2023
Country/Territory: Rwanda
City: Kigali
Period: 1/05/23 → 5/05/23
