
Black-box Adversarial Attacks on Commercial Speech Platforms with Minimal Information

  • Baolin Zheng
  • Peipei Jiang
  • Qian Wang
  • Qi Li
  • Chao Shen
  • Cong Wang
  • Yunjie Ge
  • Qingyang Teng
  • Shenyi Zhang
  • Wuhan University
  • Tsinghua University
  • City University of Hong Kong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

89 citations (Scopus)

Abstract

Adversarial attacks against commercial black-box speech platforms, including cloud speech APIs and voice control devices, have received little attention until recent years. Constructing such attacks is difficult mainly due to the unique characteristics of time-domain speech signals and the much more complex architecture of acoustic systems. Current "black-box" attacks all rely heavily on knowledge of prediction/confidence scores or other probability information to craft effective adversarial examples (AEs), so service providers can defend against them simply by not returning this information. In this paper, we take one more step forward and propose two novel adversarial attacks in more practical and rigorous scenarios. For commercial cloud speech APIs, we propose Occam, a decision-only black-box adversarial attack, where only final decisions are available to the adversary. In Occam, we formulate decision-only AE generation as a discontinuous large-scale global optimization problem, and solve it by adaptively decomposing this complicated problem into a set of sub-problems and cooperatively optimizing each one. Occam is a one-size-fits-all approach that achieves a 100% success rate of attack (SRoA) with an average SNR of 14.23 dB on a wide range of popular speech and speaker recognition APIs, including Google, Alibaba, Microsoft, Tencent, iFlytek, and Jingdong, outperforming state-of-the-art black-box attacks. For commercial voice control devices, we propose NI-Occam, the first non-interactive physical adversarial attack, where the adversary does not need to query the oracle and has no access to its internal information or training data. We, for the first time, combine adversarial attacks with model inversion attacks, and thus generate physically effective audio AEs with high transferability without any interaction with target devices. Our experimental results show that NI-Occam can successfully fool Apple Siri, Microsoft Cortana, Google Assistant, iFlytek, and Amazon Echo with an average SRoA of 52% and SNR of 9.65 dB, shedding light on non-interactive physical attacks against voice control devices.
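The decision-only setting and the SNR figures quoted in the abstract can be illustrated with a toy sketch. This is not the authors' Occam algorithm: it is a simplified block-wise search under stated assumptions. The linear oracle, the function names (`snr_db`, `make_toy_oracle`, `decision_only_search`, `run_demo`), and all parameters are hypothetical. The sketch keeps only the two ideas the abstract names: the attacker sees a hard decision and no scores, and the perturbation is optimized one group of coordinates at a time.

```python
import math
import random


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB: 10 * log10(signal energy / noise energy)."""
    p_signal = sum(s * s for s in signal)
    p_noise = sum(n * n for n in noise)
    return 10.0 * math.log10(p_signal / p_noise)


def make_toy_oracle(weights, bias):
    """Hypothetical decision-only oracle: returns a hard label, never a score."""
    def oracle(x):
        return sum(w * v for w, v in zip(weights, x)) + bias > 0.0
    return oracle


def decision_only_search(oracle, x, start, group_size=4, rounds=50, seed=0):
    """Shrink an adversarial perturbation one coordinate group at a time.

    `start` must already receive the target decision (oracle returns True).
    Each candidate scales one group of perturbation coordinates by a factor
    in [0, 1], so its energy never increases; it is kept only if the
    oracle's decision is still True.
    """
    rng = random.Random(seed)
    delta = [s - v for s, v in zip(start, x)]
    n = len(x)
    queries = 0
    for _ in range(rounds):
        indices = list(range(n))
        rng.shuffle(indices)  # regroup the coordinates at random each round
        groups = [indices[i:i + group_size] for i in range(0, n, group_size)]
        for group in groups:
            candidate = list(delta)
            for i in group:
                candidate[i] *= rng.uniform(0.0, 1.0)
            queries += 1  # one hard-label query per candidate
            if oracle([v + d for v, d in zip(x, candidate)]):
                delta = candidate
    return delta, queries


def run_demo():
    """Run the search against a random toy oracle (all values illustrative)."""
    rng = random.Random(1)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(32)]
    oracle = make_toy_oracle(weights, bias=0.0)
    x = [-0.5 * w for w in weights]      # original input, decision False
    start = [2.0 * w for w in weights]   # known target-class input, decision True
    initial_delta = [s - v for s, v in zip(start, x)]
    delta, queries = decision_only_search(oracle, x, start)
    return oracle, x, initial_delta, delta, queries


if __name__ == "__main__":
    oracle, x, initial_delta, delta, queries = run_demo()
    print("queries:", queries)
    print("SNR before (dB):", snr_db(x, initial_delta))
    print("SNR after (dB):", snr_db(x, delta))
```

Running the script prints the query count and the SNR before and after the search. A higher SNR means a smaller perturbation relative to the original signal, which is the sense in which the abstract reports 14.23 dB and 9.65 dB.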

Original language: English
Title of host publication: CCS 2021 - Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security
Publisher: Association for Computing Machinery
Pages: 86-107
Number of pages: 22
ISBN (electronic): 9781450384544
Publication status: Published - 13 Nov 2021
Event: 27th ACM Annual Conference on Computer and Communication Security, CCS 2021 - Virtual, Online, South Korea
Duration: 15 Nov 2021 to 19 Nov 2021

Publication series

Name: Proceedings of the ACM Conference on Computer and Communications Security
ISSN (print): 1543-7221

Conference

Conference: 27th ACM Annual Conference on Computer and Communication Security, CCS 2021
Country/Territory: South Korea
Virtual, Online
Period: 15/11/21 to 19/11/21
