TY - GEN
T1 - Graph-QMIX
T2 - 37th Youth Academic Annual Conference of Chinese Association of Automation, YAC 2022
AU - Pan, Duoning
AU - An, Dou
AU - Zhang, Ruining
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Recent advances in multi-agent reinforcement learning have made it possible to solve increasingly complex tasks. However, current multi-agent reinforcement learning faces two challenges: 1) training the neural network typically relies on the global state, which is hard to obtain in the real world; 2) replacing the global state with concatenated local observations degrades the performance of multi-agent reinforcement learning algorithms. These challenges make it difficult to apply multi-agent reinforcement learning algorithms in real-world scenarios. To address them, we propose the Graph-QMIX algorithm, in which all agents are modeled as a graph and a graph convolutional neural network integrates the agents' local observations. We evaluate our method on the 2s_vs_1sc and 10m_vs_11m maps of the SMAC environment. Empirical simulation results show that our method achieves performance comparable to QMIX using the global state and much better than QMIX using concatenated local observations.
UR - https://www.scopus.com/pages/publications/85147899058
U2 - 10.1109/YAC57282.2022.10023781
DO - 10.1109/YAC57282.2022.10023781
M3 - Conference contribution
AN - SCOPUS:85147899058
T3 - Proceedings - 2022 37th Youth Academic Annual Conference of Chinese Association of Automation, YAC 2022
SP - 1275
EP - 1280
BT - Proceedings - 2022 37th Youth Academic Annual Conference of Chinese Association of Automation, YAC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 November 2022 through 20 November 2022
ER -