Option-based Multi-agent Exploration

  • Xuwei Song
  • Lipeng Wan
  • Zeyang Liu
  • Xingyu Chen
  • Xuguang Lan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Effective exploration is essential to cooperative multi-agent reinforcement learning (MARL). However, existing MARL exploration algorithms still face two challenges: an enormous exploration space and partial observability constraints. To address these challenges, we propose a method called option-based multi-agent exploration (OMAE). We introduce the concept of options to reduce the number of decisions, where options are defined as policies with a termination condition. Option-based exploration improves learning efficiency because the option space is much smaller than the original policy space. To overcome partial observability constraints, where the global state is not available during execution, we use a dual-policy framework. Our framework separates the exploration and exploitation policies to ensure that the exploitation policy has access to the state information without explicitly taking the options as input. We further introduce a likelihood estimation to address the distribution shift between the two policies. Experimental results show that OMAE improves coordination ability compared with the baseline methods on most tasks in the StarCraft II environment (SMAC).
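
The definition of an option as a policy with a termination condition can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D corridor, the `Option` class, and the `step` and `run_option` helpers are all hypothetical names introduced here for illustration only. The sketch shows how a single option choice can stand in for several primitive decisions.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Option:
    """Hypothetical option: a low-level policy plus a termination condition."""
    name: str
    policy: Callable[[int], int]       # observation -> primitive action
    terminates: Callable[[int], bool]  # observation -> should the option stop?


def step(position: int, action: int, length: int = 10) -> int:
    """Toy 1-D corridor (an assumption): move by -1 or +1, clipped to [0, length]."""
    return max(0, min(length, position + action))


def run_option(option: Option, position: int, max_steps: int = 50) -> Tuple[int, int]:
    """Follow the option's policy until it terminates.

    Returns the final position and the number of primitive steps taken.
    """
    steps = 0
    while steps < max_steps and not option.terminates(position):
        position = step(position, option.policy(position))
        steps += 1
    return position, steps


# Two illustrative options: walk to the right end or to the left end.
go_right = Option("go_right", policy=lambda pos: +1, terminates=lambda pos: pos >= 10)
go_left = Option("go_left", policy=lambda pos: -1, terminates=lambda pos: pos <= 0)

# One high-level decision (choosing go_right) replaces seven primitive decisions.
final_position, primitive_steps = run_option(go_right, position=3)
print(final_position, primitive_steps)
```

In this toy setting an explorer chooses between two options rather than making a separate choice at every time step, which is the sense in which an option space can be smaller than the primitive decision space.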

Original language: English
Title of host publication: 2022 12th International Conference on CYBER Technology in Automation, Control, and Intelligent Systems, CYBER 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 332-337
Number of pages: 6
ISBN (Electronic): 9781665472678
State: Published - 2022
Event: 12th International Conference on CYBER Technology in Automation, Control, and Intelligent Systems, CYBER 2022 - Baishan, China
Duration: 27 Jul 2022 to 31 Jul 2022

Publication series

Name: 2022 12th International Conference on CYBER Technology in Automation, Control, and Intelligent Systems, CYBER 2022

Conference

Conference: 12th International Conference on CYBER Technology in Automation, Control, and Intelligent Systems, CYBER 2022
Country/Territory: China
City: Baishan
Period: 27/07/22 to 31/07/22

