Deep Adversarial Reinforcement Learning for Object Disentangling

Paper title

Deep Adversarial Reinforcement Learning for Object Disentangling

Authors

Melvin Laux, Oleg Arenz, Jan Peters, Joni Pajarinen

Abstract

Deep learning, combined with improved training techniques and high computational power, has led to recent advances in reinforcement learning (RL) and to successful robotic RL applications such as in-hand manipulation. However, most robotic RL relies on a well-known initial state distribution, and in real-world tasks this information is often unavailable. For example, when disentangling waste objects, the actual position of the robot relative to the objects may not match the positions the RL policy was trained for. To solve this problem, we present a novel adversarial reinforcement learning (ARL) framework. The ARL framework uses an adversary that is trained to steer the original agent, the protagonist, to challenging states. We train the protagonist and the adversary jointly so that each can adapt to the changing policy of its opponent. We show that our method can generalize from training to test scenarios by training an end-to-end system for robot control to solve a challenging object disentangling task. Experiments with a KUKA LBR+ 7-DOF robot arm show that our approach outperforms the baseline method in disentangling when starting from initial states that differ from those provided during training.
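
The joint training idea in the abstract can be illustrated with a toy sketch. This is not the authors' method or code: it assumes a hypothetical 7-cell chain world, a tabular Q-learning protagonist, and an adversary that merely picks the start state (a strong simplification of an adversary that steers the agent). All names (`step`, `start_probs`, `train`, `greedy_steps`) and hyperparameters are made up for illustration.

```python
import math
import random

# Hypothetical toy chain world; none of these values come from the paper.
N_STATES, GOAL, MAX_STEPS = 7, 3, 20
LEFT, RIGHT = 0, 1


def step(state, action):
    """Move one cell on the chain; each step costs -1 and the episode ends at GOAL."""
    nxt = max(0, state - 1) if action == LEFT else min(N_STATES - 1, state + 1)
    return nxt, -1.0, nxt == GOAL


def start_probs(prefs, mix=0.3):
    """Adversary's start-state distribution: softmax of preferences mixed with uniform."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [(1 - mix) * e / z + mix / len(prefs) for e in exps]


def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.1, adv_lr=0.05, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]      # protagonist: tabular Q-values
    starts = [s for s in range(N_STATES) if s != GOAL]
    prefs = [0.0] * len(starts)                     # adversary: one preference per start
    baseline = 0.0
    for _ in range(episodes):
        # Adversary turn (simplified): choose where the protagonist starts.
        i = rng.choices(range(len(starts)), weights=start_probs(prefs))[0]
        state, ret = starts[i], 0.0
        # Protagonist turn: epsilon-greedy Q-learning from that start state.
        for _ in range(MAX_STEPS):
            if rng.random() < eps:
                action = rng.randrange(2)
            else:
                action = LEFT if q[state][LEFT] >= q[state][RIGHT] else RIGHT
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state, ret = nxt, ret + reward
            if done:
                break
        # The adversary is rewarded with the negative protagonist return.
        adv_reward = -ret
        baseline += 0.05 * (adv_reward - baseline)
        # Gradient-bandit-style update; it ignores the uniform mixing (an approximation).
        soft = start_probs(prefs, mix=0.0)
        for j in range(len(prefs)):
            grad = (1.0 if j == i else 0.0) - soft[j]
            prefs[j] += adv_lr * (adv_reward - baseline) * grad
    return q, dict(zip(starts, start_probs(prefs)))


def greedy_steps(q, start):
    """Steps the greedy policy needs to reach GOAL, or None if it fails in MAX_STEPS."""
    state = start
    for n in range(1, MAX_STEPS + 1):
        action = LEFT if q[state][LEFT] >= q[state][RIGHT] else RIGHT
        state, _, done = step(state, action)
        if done:
            return n
    return None


if __name__ == "__main__":
    q, probs = train()
    for s, p in probs.items():
        print(f"start={s} adversary_prob={p:.2f} greedy_steps={greedy_steps(q, s)}")
```

Because both learners update after every episode, the adversary's start-state distribution shifts as the protagonist improves. The uniform mixing term in this sketch keeps every start state visited, which is a design choice of the example rather than something stated in the abstract.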
