TY - GEN
T1 - Multi-Agent Path Finding Method Based on Evolutionary Reinforcement Learning
AU - Shi, Qinru
AU - Liu, Meiqin
AU - Zhang, Senlin
AU - Zheng, Ronghao
AU - Lan, Xuguang
N1 - Publisher Copyright: © 2024 Technical Committee on Control Theory, Chinese Association of Automation.
PY - 2024
Y1 - 2024
N2 - The multi-agent path finding (MAPF) problem is crucial to improving the efficiency of warehouse systems. Compared with traditional centralized methods, whose computational complexity escalates with increasing scale, reinforcement learning-based methods have proven effective for solving the MAPF problem. Nevertheless, in complex and large-scale scenarios, the policies learned by existing reinforcement learning-based methods are generally inadequate to address the challenges effectively. By leveraging the concepts of policy evaluation and policy evolution, this paper aims to improve performance and sample efficiency. To that end, we introduce an MAPF method based on evolutionary reinforcement learning. In particular, we design a collaborative policy network model based on reinforcement learning. We then construct a novel evolutionary reinforcement learning training framework. Policy evaluation is carried out through a quantitative evaluation mechanism, and an evolutionary algorithm is used for policy evolution, so that the collaborative policy can better guide the agents to complete the path finding task. We test on high-density warehouse environment instances of various map sizes, and the experimental results show that our method achieves a high success rate and a low average number of steps.
AB - The multi-agent path finding (MAPF) problem is crucial to improving the efficiency of warehouse systems. Compared with traditional centralized methods, whose computational complexity escalates with increasing scale, reinforcement learning-based methods have proven effective for solving the MAPF problem. Nevertheless, in complex and large-scale scenarios, the policies learned by existing reinforcement learning-based methods are generally inadequate to address the challenges effectively. By leveraging the concepts of policy evaluation and policy evolution, this paper aims to improve performance and sample efficiency. To that end, we introduce an MAPF method based on evolutionary reinforcement learning. In particular, we design a collaborative policy network model based on reinforcement learning. We then construct a novel evolutionary reinforcement learning training framework. Policy evaluation is carried out through a quantitative evaluation mechanism, and an evolutionary algorithm is used for policy evolution, so that the collaborative policy can better guide the agents to complete the path finding task. We test on high-density warehouse environment instances of various map sizes, and the experimental results show that our method achieves a high success rate and a low average number of steps.
KW - Multi-agent systems
KW - deep learning in robotics and automation
KW - evolutionary algorithm
KW - multi-agent path finding
KW - reinforcement learning
UR - https://www.scopus.com/pages/publications/85205523987
U2 - 10.23919/CCC63176.2024.10661475
DO - 10.23919/CCC63176.2024.10661475
M3 - Conference contribution
AN - SCOPUS:85205523987
T3 - Chinese Control Conference, CCC
SP - 5728
EP - 5733
BT - Proceedings of the 43rd Chinese Control Conference, CCC 2024
A2 - Na, Jing
A2 - Sun, Jian
PB - IEEE Computer Society
T2 - 43rd Chinese Control Conference, CCC 2024
Y2 - 28 July 2024 through 31 July 2024
ER -