TY - JOUR
T1 - Multi-UAV Cooperative Pursuit Planning via Communication-Aware Multi-Agent Reinforcement Learning
AU - Ren, Haojie
AU - Han, Chunlei
AU - Pan, Hao
AU - Sun, Jianjun
AU - Li, Shuanglin
AU - An, Dou
AU - Hu, Kunhao
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/11
Y1 - 2025/11
AB - Cooperative pursuit using multi-UAV systems presents significant challenges in dynamic task allocation, real-time coordination, and trajectory optimization within complex environments. To address these issues, this paper proposes a reinforcement learning-based task planning framework that employs a distributed Actor–Critic architecture enhanced with bidirectional recurrent neural networks (BRNN). The pursuit–evasion scenario is modeled as a multi-agent Markov decision process, enabling each UAV to make informed decisions based on shared observations and coordinated strategies. A multi-stage reward function and a BRNN-driven communication mechanism are introduced to improve inter-agent collaboration and learning stability. Extensive simulations across various deployment scenarios, including 3-vs-1 and 5-vs-2 configurations, demonstrate that the proposed method achieves a success rate of at least 90% and reduces the average capture time by at least 19% compared to rule-based baselines, confirming its superior effectiveness, robustness, and scalability in cooperative pursuit missions.
KW - bidirectional recurrent neural network
KW - cooperative pursuit
KW - multi-agent coordination
KW - multi-UAV systems
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/105023137135
U2 - 10.3390/aerospace12110993
DO - 10.3390/aerospace12110993
M3 - Article
AN - SCOPUS:105023137135
SN - 2226-4310
VL - 12
JO - Aerospace
JF - Aerospace
IS - 11
M1 - 993
ER -