Spatial Reasoning from Natural Language Instructions for Robot Manipulation

Paper Title

Spatial Reasoning from Natural Language Instructions for Robot Manipulation

Authors

Venkatesh, Sagar Gubbi; Biswas, Anirban; Upadrashta, Raviteja; Srinivasan, Vikram; Talukdar, Partha; Amrutur, Bharadwaj

Abstract

Robots that manipulate objects in unstructured environments and collaborate with humans can benefit immensely from understanding natural language. We propose a two-stage pipelined architecture to perform spatial reasoning on the text input. All objects in the scene are first localized; the natural-language instruction for the robot and the localized coordinates are then mapped to the start and end coordinates where the robot must pick up and place the object, respectively. We show that representing the localized objects by quantizing their positions to a binary grid is preferable to representing them as a list of 2D coordinates. We also show that attention improves generalization and can overcome biases in the dataset. The proposed method is used to pick and place playing cards with a robot arm.
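
To make the binary-grid representation mentioned in the abstract concrete, here is a minimal sketch of how a list of localized 2D positions could be quantized to an occupancy grid. The abstract does not give the grid resolution, the coordinate convention, or how the grid is fed to the model, so the function name `quantize_to_binary_grid`, the workspace size, and the 4x4 grid below are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Sequence, Tuple


def quantize_to_binary_grid(
    points: Sequence[Tuple[float, float]],
    rows: int,
    cols: int,
    width: float = 1.0,
    height: float = 1.0,
) -> List[List[int]]:
    """Mark each grid cell that contains at least one localized object.

    Assumes (hypothetically) that positions lie in [0, width] x [0, height],
    with x increasing along columns and y increasing along rows.
    """
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if not (0.0 <= x <= width and 0.0 <= y <= height):
            raise ValueError(f"point {(x, y)} lies outside the workspace")
        # Scale to cell indices; clamp so points on the far edge
        # fall into the last cell instead of indexing out of range.
        col = min(int(x / width * cols), cols - 1)
        row = min(int(y / height * rows), rows - 1)
        grid[row][col] = 1
    return grid


# Two objects in a unit-square workspace, quantized to a 4x4 grid.
example_grid = quantize_to_binary_grid([(0.1, 0.1), (0.9, 0.5)], rows=4, cols=4)
```

One consequence of this encoding is that the grid has a fixed size regardless of how many objects are in the scene, whereas a coordinate list grows with the object count; note that two objects falling in the same cell become indistinguishable in this sketch.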
