TY - GEN
T1 - PEFTGuard
T2 - 46th IEEE Symposium on Security and Privacy, SP 2025
AU - Sun, Zhen
AU - Cong, Tianshuo
AU - Liu, Yule
AU - Lin, Chenhao
AU - He, Xinlei
AU - Chen, Rongmao
AU - Han, Xingshuo
AU - Huang, Xinyi
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Fine-tuning is an essential process to improve the performance of Large Language Models (LLMs) in specific domains, with Parameter-Efficient Fine-Tuning (PEFT) gaining popularity due to its capacity to reduce computational demands through the integration of low-rank adapters. These lightweight adapters, such as LoRA, can be shared and utilized on open-source platforms. However, adversaries could exploit this mechanism to inject backdoors into these adapters, resulting in malicious behaviors like incorrect or harmful outputs, which pose serious security risks to the community. Unfortunately, few current efforts concentrate on analyzing the backdoor patterns or detecting the backdoors in the adapters. To fill this gap, we first construct and release PADBench, a comprehensive benchmark that contains 13,300 benign and backdoored adapters fine-tuned with various datasets, attack strategies, PEFT methods, and LLMs. Moreover, we propose PEFTGuard, the first backdoor detection framework against PEFT-based adapters. Extensive evaluation on PADBench shows that PEFTGuard outperforms existing detection methods, achieving nearly perfect detection accuracy (100%) in most cases. Notably, PEFTGuard exhibits zero-shot transferability on three aspects, including different attacks, PEFT methods, and adapter ranks. In addition, we consider various adaptive attacks to demonstrate the high robustness of PEFTGuard. We further explore several possible backdoor mitigation defenses, finding fine-mixing to be the most effective method. We envision that our benchmark and method can shed light on future LLM backdoor detection research. Our code and dataset are available at: https://github.com/Vincent-HKUSTGZ/PEFTGuard.
UR - https://www.scopus.com/pages/publications/105009324535
U2 - 10.1109/SP61157.2025.00161
DO - 10.1109/SP61157.2025.00161
M3 - Conference contribution
AN - SCOPUS:105009324535
T3 - Proceedings - IEEE Symposium on Security and Privacy
SP - 1713
EP - 1731
BT - Proceedings - 46th IEEE Symposium on Security and Privacy, SP 2025
A2 - Blanton, Marina
A2 - Enck, William
A2 - Nita-Rotaru, Cristina
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 12 May 2025 through 15 May 2025
ER -