Research on Complex Robot Manipulation Tasks Based on Hindsight Trust Region Policy Optimization

  • Xi'an Jiaotong University

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Deep reinforcement learning (DRL) algorithms have made remarkable progress in robot manipulation tasks in recent years. However, successfully completing a task relies heavily on a specially designed reward function, which requires engineering experience or domain-specific knowledge. To avoid complex reward shaping and make robot learning more general, it is essential to study sparse-reward environments. In this paper, we present two types of challenging goal-conditioned sparse-reward tasks with a 7-DoF robot arm: one is a target-reaching task with obstacles, and the other is a dynamic-object task in which the target object moves at a certain speed. Based on the Hindsight Trust Region Policy Optimization (HTRPO) algorithm proposed by our research group, we study the control performance on the two types of tasks, which have continuous high-dimensional state spaces. The results show that HTRPO achieves more stable policy performance, a higher success rate, and better sample efficiency than its baseline algorithms, TRPO and HPG. However, challenges remain in solving tasks in which the target moves at high speed.
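
To make the terms "goal-conditioned sparse reward" and "hindsight" concrete, the following is a minimal Python sketch of the generic idea only. It is not the HTRPO algorithm from the paper. The reward convention (0 on success, -1 otherwise), the tolerance `tol`, and the names `sparse_reward` and `relabel_with_final_state` are all assumptions made for illustration.

```python
import math


def sparse_reward(achieved, goal, tol=0.05):
    """Assumed convention: 0.0 when within `tol` of the goal, otherwise -1.0."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0


def relabel_with_final_state(trajectory, tol=0.05):
    """Generic hindsight relabeling (illustrative, not the paper's method).

    The position reached at the last step is treated as if it had been the
    goal all along, and every step's reward is recomputed against it.
    A new list is returned; the input trajectory is left unchanged.
    """
    new_goal = trajectory[-1]["achieved"]
    return [
        {**step, "goal": new_goal,
         "reward": sparse_reward(step["achieved"], new_goal, tol)}
        for step in trajectory
    ]


# A failed episode: the arm never gets near the original goal (1, 1, 1).
episode = [
    {"achieved": (0.0, 0.0, 0.0), "goal": (1.0, 1.0, 1.0), "reward": -1.0},
    {"achieved": (0.1, 0.0, 0.0), "goal": (1.0, 1.0, 1.0), "reward": -1.0},
    {"achieved": (0.2, 0.0, 0.0), "goal": (1.0, 1.0, 1.0), "reward": -1.0},
]
relabeled = relabel_with_final_state(episode)
```

Under the original goal every step of `episode` is a failure and carries no learning signal, whereas the relabeled copy ends in a success for the substituted goal. This is the intuition behind hindsight methods; how HTRPO uses such data inside a trust-region update is described in the paper itself.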

Original language: English
Title of host publication: Proceedings - 2020 Chinese Automation Congress, CAC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4541-4546
Number of pages: 6
ISBN (Electronic): 9781728176871
DOI
Publication status: Published - 6 Nov 2020
Event: 2020 Chinese Automation Congress, CAC 2020 - Shanghai, China
Duration: 6 Nov 2020 to 8 Nov 2020

Publication series

Name: Proceedings - 2020 Chinese Automation Congress, CAC 2020

Conference

Conference: 2020 Chinese Automation Congress, CAC 2020
Country/Territory: China
City: Shanghai
Period: 6/11/20 to 8/11/20
