Hierarchical Heterogeneous Multi-Agent Cross-Domain Search Method Based on Deep Reinforcement Learning

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Marine target searching is a complex task: large search areas, unique signal propagation characteristics, and limited visibility pose significant challenges for single-agent or homogeneous multi-agent systems. In response, we propose a novel hierarchical heterogeneous multi-agent (HHMA) framework designed for underwater search scenarios. The framework integrates three types of vehicles operating in different domains (unmanned aerial, surface, and underwater vehicles), overcoming the limitations of single- or dual-agent configurations. We begin by explaining the advantages of the HHMA system in target searching, then provide the kinematic modeling, transform the sonar detection data, and define the search problem. The mission is decomposed into three human-comprehensible subtasks that adapt to both environmental conditions and equipment capabilities: moving, target estimating, and trajectory planning. The target estimating subtask is modeled as a Markov Decision Process while retaining its memory capability. Additionally, we extend multi-agent reinforcement learning to multi-policy reinforcement learning, facilitating the training of interdependent policies. The efficacy of our approach is demonstrated through simulations that compare it with rule-based methods. Simulation results underscore the significance of the HHMA system and validate the proposed training methodology.

Original language: English
Pages (from-to): 18872-18883
Number of pages: 12
Journal: IEEE Transactions on Intelligent Transportation Systems
Volume: 25
Issue number: 11
State: Published - 2024

Keywords

  • Hierarchical heterogeneous multi-agent
  • cross-domain
  • multi-policy reinforcement learning
  • target searching
