DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools

Paper title

DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools

Authors

Xingyu Lin, Zhiao Huang, Yunzhu Li, Joshua B. Tenenbaum, David Held, Chuang Gan

Abstract

We consider the problem of sequential robotic manipulation of deformable objects using tools. Previous works have shown that differentiable physics simulators provide gradients to the environment state and help trajectory optimization to converge orders of magnitude faster than model-free reinforcement learning algorithms for deformable object manipulation. However, such gradient-based trajectory optimization typically requires access to the full simulator states and can only solve short-horizon, single-skill tasks due to local optima. In this work, we propose a novel framework, named DiffSkill, that uses a differentiable physics simulator for skill abstraction to solve long-horizon deformable object manipulation tasks from sensory observations. In particular, we first obtain short-horizon skills using individual tools from a gradient-based optimizer, using the full state information in a differentiable simulator; we then learn a neural skill abstractor from the demonstration trajectories which takes RGBD images as input. Finally, we plan over the skills by finding the intermediate goals and then solve long-horizon tasks. We show the advantages of our method in a new set of sequential deformable object manipulation tasks compared to previous reinforcement learning algorithms and compared to the trajectory optimizer.
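
The gradient-based trajectory optimization mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's simulator or code: the one-dimensional damped dynamics, the function names `rollout`, `loss_and_grad`, and `optimize`, and all parameter values are assumptions made for illustration, and the gradient is derived by hand rather than produced by a differentiable physics engine. The sketch only shows the general idea of pushing the gradient of a final-state loss back through a rollout to update an action sequence.

```python
# Toy illustration only: a 1-D "simulator" standing in for a differentiable
# physics engine. All names and constants here are hypothetical.

def rollout(x0, actions, dt=0.1, damping=0.9):
    """Simulate x_{t+1} = damping * x_t + dt * u_t and return the final state."""
    x = x0
    for u in actions:
        x = damping * x + dt * u
    return x


def loss_and_grad(x0, actions, goal, dt=0.1, damping=0.9):
    """Return the squared distance to the goal and its gradient w.r.t. each action."""
    x_final = rollout(x0, actions, dt, damping)
    loss = (x_final - goal) ** 2

    # Backward pass: start from d(loss)/d(x_final) and walk the rollout in reverse.
    adjoint = 2.0 * (x_final - goal)
    grads = [0.0] * len(actions)
    for t in reversed(range(len(actions))):
        grads[t] = adjoint * dt      # d(x_{t+1})/d(u_t) = dt
        adjoint = adjoint * damping  # d(x_{t+1})/d(x_t) = damping
    return loss, grads


def optimize(x0, goal, horizon=20, steps=200, lr=1.0):
    """Plain gradient descent on the whole action sequence."""
    actions = [0.0] * horizon
    for _ in range(steps):
        _, grads = loss_and_grad(x0, actions, goal)
        actions = [u - lr * g for u, g in zip(actions, grads)]
    return actions


if __name__ == "__main__":
    plan = optimize(x0=0.0, goal=1.0)
    print("final state:", rollout(0.0, plan))
```

Because this toy loss is quadratic in the actions, plain gradient descent converges without trouble. The local-optima issue the abstract describes arises in the far less benign loss landscapes of long-horizon, multi-tool tasks, which this sketch does not attempt to reproduce.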
