TY - GEN
T1 - Membership Inference Attacks by Exploiting Loss Trajectory
AU - Liu, Yiyong
AU - Zhao, Zhengyu
AU - Backes, Michael
AU - Zhang, Yang
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/11/7
Y1 - 2022/11/7
N2 - Machine learning models are vulnerable to membership inference attacks in which an adversary aims to predict whether or not a particular sample was contained in the target model's training dataset. Existing attack methods have commonly exploited the output information (mostly, losses) solely from the given target model. As a result, in practical scenarios where both the member and non-member samples yield similarly small losses, these methods are naturally unable to differentiate between them. To address this limitation, in this paper, we propose a new attack method, called TrajectoryMIA, which can exploit the membership information from the whole training process of the target model for improving the attack performance. To mount the attack in the common black-box setting, we leverage knowledge distillation, and represent the membership information by the losses evaluated on a sequence of intermediate models at different distillation epochs, namely distilled loss trajectory, together with the loss from the given target model. Experimental results over different datasets and model architectures demonstrate the great advantage of our attack in terms of different metrics. For example, on CINIC-10, our attack achieves at least 6 times higher true-positive rate at a low false-positive rate of 0.1% than existing methods. Further analysis demonstrates the general effectiveness of our attack in more strict scenarios.
KW - knowledge distillation
KW - loss trajectory
KW - membership inference
UR - https://www.scopus.com/pages/publications/85142824991
U2 - 10.1145/3548606.3560684
DO - 10.1145/3548606.3560684
M3 - Conference contribution
AN - SCOPUS:85142824991
T3 - Proceedings of the ACM Conference on Computer and Communications Security
SP - 2085
EP - 2098
BT - CCS 2022 - Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security
PB - Association for Computing Machinery
T2 - 29th ACM SIGSAC Conference on Computer and Communications Security, CCS 2022
Y2 - 7 November 2022 through 11 November 2022
ER -