TY - GEN
T1 - A New Deep Reinforcement Learning Algorithm for UAV Swarm Confrontation Game
AU - Xie, Laicai
AU - Ma, Wanpeng
AU - Wang, Liping
AU - Ke, Liangjun
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
PY - 2024
Y1 - 2024
N2 - The UAV swarm confrontation game is a type of intelligent game problem, and multi-agent reinforcement learning theory provides an effective solution for it. However, when common multi-agent deep reinforcement learning algorithms, such as the multi-agent deep deterministic policy gradient (MADDPG) algorithm, are used to train the strategy of a UAV swarm, they suffer from slow convergence and weak generalization to similar tasks. To address these issues, this paper combines the model-agnostic meta-learning (MAML) algorithm from few-shot learning with the original MADDPG algorithm and proposes an improved MB-MADDPG algorithm, which is applied to strategy optimization in a UAV swarm confrontation task. Experimental results show that, compared with the original algorithm, the improved algorithm accelerates convergence while maintaining the training effect, and that the defense success rate after training exceeds 50% with both algorithms.
KW - Few-shot Learning
KW - MADDPG
KW - MAML
KW - Multi-agent Reinforcement Learning
KW - UAV Swarm Confrontation
UR - https://www.scopus.com/pages/publications/85187638213
U2 - 10.1007/978-981-97-0837-6_14
DO - 10.1007/978-981-97-0837-6_14
M3 - Conference contribution
AN - SCOPUS:85187638213
SN - 9789819708369
T3 - Communications in Computer and Information Science
SP - 199
EP - 210
BT - Data Mining and Big Data - 8th International Conference, DMBD 2023, Proceedings
A2 - Tan, Ying
A2 - Shi, Yuhui
PB - Springer Science and Business Media Deutschland GmbH
T2 - 8th International Conference on Data Mining and Big Data, DMBD 2023
Y2 - 9 December 2023 through 12 December 2023
ER -