TY - JOUR
T1 - SCC-rFMQ
T2 - a multiagent reinforcement learning method in cooperative Markov games with continuous actions
AU - Zhang, Chengwei
AU - Han, Zhuobing
AU - Liu, Bingfu
AU - Xue, Wanli
AU - Hao, Jianye
AU - Li, Xiaohong
AU - An, Dou
AU - Chen, Rong
N1 - Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
PY - 2022/7
Y1 - 2022/7
N2 - Although many multiagent reinforcement learning (MARL) methods have been proposed for learning optimal solutions in continuous-action domains, multiagent cooperation domains with independent learners (ILs) have received relatively little investigation, especially in the traditional RL setting. In this paper, we propose a sample-based independent learning method, named Sample Continuous Coordination with recursive Frequency Maximum Q-Value (SCC-rFMQ), which divides the multiagent cooperative problem with continuous actions into two layers. The first layer samples a finite set of actions from the continuous action space via a re-sampling mechanism with variable exploration rates, and the second layer evaluates the actions in the sampled action set and updates the policy using a cooperative reinforcement learning method. By constructing cooperative mechanisms at both levels, SCC-rFMQ can handle cooperative problems in continuous-action cooperative Markov games effectively. The effectiveness of SCC-rFMQ is demonstrated experimentally on two well-designed games, i.e., a continuous version of the climbing game and a cooperative version of the boat problem. Experimental results show that SCC-rFMQ outperforms other reinforcement learning algorithms.
AB - Although many multiagent reinforcement learning (MARL) methods have been proposed for learning optimal solutions in continuous-action domains, multiagent cooperation domains with independent learners (ILs) have received relatively little investigation, especially in the traditional RL setting. In this paper, we propose a sample-based independent learning method, named Sample Continuous Coordination with recursive Frequency Maximum Q-Value (SCC-rFMQ), which divides the multiagent cooperative problem with continuous actions into two layers. The first layer samples a finite set of actions from the continuous action space via a re-sampling mechanism with variable exploration rates, and the second layer evaluates the actions in the sampled action set and updates the policy using a cooperative reinforcement learning method. By constructing cooperative mechanisms at both levels, SCC-rFMQ can handle cooperative problems in continuous-action cooperative Markov games effectively. The effectiveness of SCC-rFMQ is demonstrated experimentally on two well-designed games, i.e., a continuous version of the climbing game and a cooperative version of the boat problem. Experimental results show that SCC-rFMQ outperforms other reinforcement learning algorithms.
KW - Continuous action space
KW - Cooperative Markov games
KW - Multiagent learning
KW - Reinforcement learning
UR - https://www.scopus.com/pages/publications/85123171337
U2 - 10.1007/s13042-021-01497-0
DO - 10.1007/s13042-021-01497-0
M3 - Article
AN - SCOPUS:85123171337
SN - 1868-8071
VL - 13
SP - 1927
EP - 1944
JO - International Journal of Machine Learning and Cybernetics
JF - International Journal of Machine Learning and Cybernetics
IS - 7
ER -