TY - JOUR
T1 - Revisiting Gradient Regularization
T2 - Inject Robust Saliency-Aware Weight Bias for Adversarial Defense
AU - Li, Qian
AU - Hu, Qingyuan
AU - Lin, Chenhao
AU - Wu, Di
AU - Shen, Chao
N1 - Publisher Copyright:
© 2005-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - Although regularizing the Jacobians of neural networks to enhance model robustness has a direct theoretical correlation with model prediction stability, a large defense performance gap exists compared with empirical perturbation-based adversarial training (e.g., PGD-based training), which also enjoys discriminative saliency maps. To mitigate this issue, in this paper we first analyze the dilemma that the gradient map of a Jacobian-regularized model has no content hierarchy marking out the salient profile of the input, a negative signal that obstructs effective adversarial defense. Based on this, we argue that incorporating robust gradient-based saliency properties into regularized training may help reduce the performance gap. Specifically, we propose a simple method called Saliency-Aware Gradient Regularization (SAGR), in which a biased weight-distribution strategy is applied to the positive gradient to structure and increase the impact of the class-gradient components inside the model's Jacobian. The strategy maintains the dominant role of the saliency-critical true-class gradient during learning and differentiates the varying importance of the gradient sensitivities that localize salient input areas. We interpret the sharpness of the true-class sensitivity as robust recognition of learning-relevant features, e.g., regions containing the dominant object in an image for classification. In contrast, false-class parts are treated as recognition-irrelevant nuisance factors, e.g., the background, and are therefore suppressed more strongly. Experimental results demonstrate the efficacy of the proposed method and validate that distinguishing these sensitivities yields further robustness gains and sharper gradient saliency maps.
AB - Although regularizing the Jacobians of neural networks to enhance model robustness has a direct theoretical correlation with model prediction stability, a large defense performance gap exists compared with empirical perturbation-based adversarial training (e.g., PGD-based training), which also enjoys discriminative saliency maps. To mitigate this issue, in this paper we first analyze the dilemma that the gradient map of a Jacobian-regularized model has no content hierarchy marking out the salient profile of the input, a negative signal that obstructs effective adversarial defense. Based on this, we argue that incorporating robust gradient-based saliency properties into regularized training may help reduce the performance gap. Specifically, we propose a simple method called Saliency-Aware Gradient Regularization (SAGR), in which a biased weight-distribution strategy is applied to the positive gradient to structure and increase the impact of the class-gradient components inside the model's Jacobian. The strategy maintains the dominant role of the saliency-critical true-class gradient during learning and differentiates the varying importance of the gradient sensitivities that localize salient input areas. We interpret the sharpness of the true-class sensitivity as robust recognition of learning-relevant features, e.g., regions containing the dominant object in an image for classification. In contrast, false-class parts are treated as recognition-irrelevant nuisance factors, e.g., the background, and are therefore suppressed more strongly. Experimental results demonstrate the efficacy of the proposed method and validate that distinguishing these sensitivities yields further robustness gains and sharper gradient saliency maps.
KW - Deep neural networks
KW - adversarial robustness
KW - gradient regularization
KW - saliency map
UR - https://www.scopus.com/pages/publications/85164400102
U2 - 10.1109/TIFS.2023.3289000
DO - 10.1109/TIFS.2023.3289000
M3 - Article
AN - SCOPUS:85164400102
SN - 1556-6013
VL - 18
SP - 5936
EP - 5949
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
ER -