Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation

Paper Title

Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation

Authors

Yifan Zhou, Shubham Sonawani, Mariano Phielipp, Simon Stepputtis, Heni Ben Amor

Abstract

Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment of time and compute resources. Still, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with a different morphology, capability, appearance, or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real-world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across four different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot's decision-making process. Code is available at https://github.com/ir-lab/ModAttn.
