Title
Policy learning in SE(3) action spaces
Authors
Abstract
In the spatial action representation, the action space spans the space of target poses for robot motion commands, i.e. SE(2) or SE(3). This approach has been used to solve challenging robotic manipulation problems and shows promise, but it is typically limited to a three-dimensional action space and short-horizon tasks. This paper proposes ASRSE3, a new method for handling higher-dimensional spatial action spaces that transforms an original MDP with a high-dimensional action space into a new MDP with a reduced action space and an augmented state space. We also propose SDQfD, a variation of DQfD designed for large action spaces. ASRSE3 and SDQfD are evaluated in the context of a set of challenging block construction tasks. We show that both methods outperform standard baselines and can be used in practice on a real robotic system.
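The core idea behind the MDP transformation can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows, under stated assumptions, how a single high-dimensional action can be factored into a sequence of lower-dimensional decisions, with the state augmented by the partial action chosen so far. The names `q_functions`, `select_action`, and `bins_per_dim` are illustrative assumptions, not from the paper.

```python
def select_action(q_functions, state, bins_per_dim):
    """Greedily choose each action dimension in turn (hedged sketch).

    q_functions[i](state, partial) returns a list of Q-values over the
    discretized bins of action dimension i, conditioned on the tuple of
    bins already chosen for earlier dimensions (the augmented state).
    """
    partial = []
    for i, q in enumerate(q_functions):
        values = q(state, tuple(partial))
        # Pick the highest-valued bin for this dimension.
        best = max(range(bins_per_dim[i]), key=lambda b: values[b])
        partial.append(best)
    return tuple(partial)
```

In this sketch, each per-dimension choice is one step of the new MDP, so the per-step action space shrinks from the product of all bins to the bins of a single dimension.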
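SDQfD's relationship to DQfD can also be sketched. DQfD's large-margin loss penalizes only the maximizing non-expert action; a "strict" variant suited to large action spaces instead sums the hinge penalty over every non-expert action that violates the margin. The function below is a minimal illustration of that strict large-margin term, not the paper's code; `strict_margin_loss` and its arguments are assumed names.

```python
def strict_margin_loss(q_values, expert_action, margin):
    """Sum of hinge penalties over all non-expert actions (hedged sketch).

    q_values: list of Q-values, one per discrete action.
    expert_action: index of the demonstrated action.
    margin: how far below the expert's Q-value other actions must stay.
    """
    target = q_values[expert_action]
    return sum(
        max(0.0, q + margin - target)
        for i, q in enumerate(q_values)
        if i != expert_action
    )
```

With thousands of spatial actions, summing over all violating actions pushes down every competing Q-value in one update, rather than only the single worst offender.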