Multi-objective deep reinforcement learning for crowd-aware robot navigation with dynamic human preference

  • Guangran Cheng
  • Yuanda Wang
  • Lu Dong
  • Wenzhe Cai
  • Changyin Sun

Research output: Contribution to journal; Article; peer-review

10 Scopus citations

Abstract

The growing development of autonomous systems is driving the application of mobile robots in crowded environments. These scenarios often require robots to satisfy multiple conflicting objectives, such as work efficiency, safety, and smoothness, each with a different relative preference. Such conflicts inherently lead to poor exploration when a robot seeks policies that optimize several performance criteria at once. In this paper, we propose a multi-objective deep reinforcement learning framework for crowd-aware robot navigation that learns policies over multiple competing objectives whose relative importance (the preference) can change dynamically for the robot. First, a two-stream structure is introduced to separately extract the spatial and temporal features of pedestrian motion. Second, to learn navigation policies for each possible preference, a multi-objective deep reinforcement learning method is proposed that maximizes a weighted-sum scalarization of the objective functions. We consider path planning and path tracking tasks, which involve the conflicting objectives of collision avoidance, target reaching, and path following. Experimental results demonstrate that our method can effectively navigate through crowds in simulated environments while satisfying different task requirements.
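
The weighted-sum scalarization mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the objective names, the reward values, and the helpers `sample_preference` and `scalarize` are hypothetical, chosen only to show how a preference vector collapses a vector-valued reward into the single scalar that a standard reinforcement learning update would maximize.

```python
import random

# Hypothetical objective names, loosely following the tasks named in the abstract.
OBJECTIVES = ("collision_avoidance", "target_reaching", "path_following")


def sample_preference(n, rng=random):
    """Draw a random preference vector: positive weights that sum to 1."""
    raw = [rng.random() + 1e-8 for _ in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def scalarize(reward_vector, preference):
    """Weighted-sum scalarization: one scalar from a vector-valued reward."""
    if len(reward_vector) != len(preference):
        raise ValueError("reward and preference must have the same length")
    return sum(w * r for w, r in zip(preference, reward_vector))


# Illustrative (made-up) per-objective rewards for a single step.
reward = [-1.0, 0.5, 0.2]

# The same step is scored differently under different preferences.
safety_first = [0.8, 0.1, 0.1]
goal_first = [0.1, 0.8, 0.1]
for name, pref in (("safety_first", safety_first), ("goal_first", goal_first)):
    print(name, round(scalarize(reward, pref), 3))

# A preference could also be resampled per episode so that one policy,
# conditioned on the preference, sees many trade-offs during training.
print(dict(zip(OBJECTIVES, sample_preference(len(OBJECTIVES)))))
```

Under this sketch, changing the preference vector changes which behavior is rewarded without changing the per-objective reward definitions, which is one plausible reading of how a single framework can serve "each possible preference".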

Original language: English
Pages (from-to): 16247-16265
Number of pages: 19
Journal: Neural Computing and Applications
Volume: 35
Issue number: 22
State: Published - Aug 2023
Externally published: Yes

Keywords

  • Crowd-aware navigation
  • Mobile robot
  • Multi-objective deep reinforcement learning
  • Path planning
  • Path tracking
