Paper Title
PERCH 2.0 : Fast and Accurate GPU-based Perception via Search for Object Pose Estimation
Paper Authors
Paper Abstract
Pose estimation of known objects is fundamental to tasks such as robotic grasping and manipulation. The need for reliable grasping imposes stringent accuracy requirements on pose estimation in cluttered, occluded scenes in dynamic environments. Modern methods employ large sets of training data to learn features in order to find correspondences between 3D models and observed data. However, these methods require extensive annotation of ground-truth poses. An alternative is to use algorithms that search for the best explanation of the observed scene in a space of possible rendered scenes. A recently developed algorithm, PERCH (PErception Via SeaRCH), does so by using depth data and a search over a specially constructed tree to converge to a globally optimal solution. While PERCH offers strong guarantees on accuracy, the current formulation suffers from low scalability owing to its high runtime. In addition, the sole reliance on depth data for pose estimation restricts the algorithm to scenes in which no two objects have the same shape. In this work, we propose PERCH 2.0, a novel perception-via-search strategy that takes advantage of GPU acceleration and RGB data. We show that our approach achieves a speedup of 100x over PERCH, as well as better accuracy than state-of-the-art data-driven approaches on 6-DoF pose estimation, without the need for annotating ground-truth poses in the training data. Our code and video are available at https://sbpl-cruz.github.io/perception/.
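The central idea the abstract describes is render-and-compare: hypothesize object poses, render each hypothesis, and score the rendering against the observed data. The sketch below illustrates only that idea and is not the PERCH 2.0 implementation; the toy single-sphere renderer, the 3-DoF translation grid, and the simple depth-difference cost are assumptions made for illustration, whereas the actual method renders full scenes in parallel on the GPU, uses RGB as well as depth, and searches 6-DoF poses over a specially constructed tree.

```python
# Minimal render-and-compare sketch (illustrative only, not PERCH 2.0):
# enumerate candidate poses, render each one, and keep the pose whose
# rendered depth best matches the observed depth image.
import numpy as np

H, W = 64, 64          # toy depth image size
FX = FY = 60.0         # toy focal lengths (pixels)
CX, CY = W / 2, H / 2  # principal point


def render_sphere_depth(center, radius=0.1):
    """Render a toy depth image of a single sphere at `center` (x, y, z) in meters."""
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    # Back-project each pixel to a unit ray through a pinhole camera at the origin.
    rays = np.stack([(us - CX) / FX, (vs - CY) / FY, np.ones((H, W))], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    c = np.asarray(center, dtype=float)
    # Ray-sphere intersection: t^2 - 2 t (d.c) + |c|^2 - r^2 = 0, take nearest root.
    b = rays @ c
    disc = b ** 2 - (c @ c - radius ** 2)
    t = b - np.sqrt(np.maximum(disc, 0.0))
    return np.where((disc >= 0) & (t > 0), t, np.inf)  # inf = background (no hit)


def pose_cost(observed, rendered):
    """Mean absolute depth difference over pixels explained by either image."""
    mask = np.isfinite(observed) | np.isfinite(rendered)
    diff = np.abs(np.where(np.isfinite(observed), observed, 0.0)
                  - np.where(np.isfinite(rendered), rendered, 0.0))
    return diff[mask].mean() if mask.any() else np.inf


if __name__ == "__main__":
    true_pose = (0.05, -0.03, 0.8)
    observed = render_sphere_depth(true_pose)

    # Exhaustive scoring over a coarse grid of candidate translations; PERCH-style
    # methods instead organize hypotheses in a search tree and render them in parallel.
    xs = ys = np.linspace(-0.1, 0.1, 9)
    zs = np.linspace(0.6, 1.0, 9)
    best = min(((pose_cost(observed, render_sphere_depth((x, y, z))), (x, y, z))
                for x in xs for y in ys for z in zs), key=lambda t: t[0])
    print("best candidate pose:", best[1], "cost:", round(best[0], 4))
```

Run as a standalone script, this recovers the grid point closest to the true toy pose; the point of the example is only to make concrete what "searching a space of possible rendered scenes" means in the abstract.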