Paper Title
Curiosity-driven 3D Object Detection Without Labels
Paper Authors
Paper Abstract
In this paper, we set out to solve the task of 6-DOF 3D object detection from 2D images, where the only supervision is a geometric representation of the objects we aim to find. In doing so, we remove the need for 6-DOF labels (i.e., position, orientation, etc.), allowing our network to be trained on unlabeled images in a self-supervised manner. We achieve this through a neural network that learns an explicit scene parameterization, which is subsequently passed into a differentiable renderer. We analyze why analysis-by-synthesis-like losses for supervising 3D scene structure via differentiable rendering are not practical, as they almost always get stuck in local minima caused by visual ambiguities. This can be overcome by a novel form of training in which an additional network is employed to steer the optimization itself to explore the entire parameter space, i.e., to be curious, and hence to resolve those ambiguities and find workable minima.
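
To make the training scheme concrete, below is a minimal, hypothetical PyTorch sketch of such a pipeline. Everything in it is an illustrative assumption rather than the authors' implementation: render is a toy stand-in for a real differentiable renderer, PoseNet and CuriosityNet are placeholder architectures, the 0.1 bonus weight is arbitrary, and the curiosity term (a network that regresses the reconstruction error, with its predicted error paid out as an exploration bonus) is one simple formulation of the idea, not necessarily the mechanism used in the paper.

import torch
import torch.nn as nn

torch.manual_seed(0)
IMG = 32                                   # toy image side length

# Fixed toy "renderer": a linear map from a 6-DOF pose to an image.
# Placeholder for a real differentiable renderer rasterizing the known
# object geometry; only the loss wiring matters for this sketch.
W_RENDER = torch.randn(6, IMG * IMG)

def render(pose):                          # (B, 6) -> (B, IMG*IMG)
    return torch.tanh(pose @ W_RENDER)

class PoseNet(nn.Module):
    """Maps an image to an explicit 6-DOF scene parameterization."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG * IMG, 128), nn.ReLU(), nn.Linear(128, 6))

    def forward(self, img):
        return self.net(img)

class CuriosityNet(nn.Module):
    """Predicts the reconstruction error for a pose; high predictions
    mark under-explored or ambiguous regions of pose space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, pose):
        return self.net(pose).squeeze(1)

pose_net, curio_net = PoseNet(), CuriosityNet()
opt_pose = torch.optim.Adam(pose_net.parameters(), lr=1e-3)
opt_curio = torch.optim.Adam(curio_net.parameters(), lr=1e-3)

for step in range(200):
    # Unlabeled images: synthesized from random poses here, but the
    # ground-truth poses never enter any loss term.
    with torch.no_grad():
        images = render(torch.randn(16, 6))

    pose = pose_net(images)
    recon = render(pose)                   # analysis by synthesis
    photo_err = (recon - images).pow(2).mean(dim=1)   # per sample

    # Detector update: minimize photometric error minus a curiosity
    # bonus, so poses the curiosity net still scores as high-error
    # keep attracting the optimizer across ambiguous local minima.
    opt_pose.zero_grad()
    (photo_err - 0.1 * curio_net(pose)).mean().backward()
    opt_pose.step()

    # Curiosity update: regress the observed error, so the bonus
    # fades exactly where optimization has already settled. zero_grad
    # also clears stale curiosity grads from the detector's backward.
    opt_curio.zero_grad()
    curio_loss = (curio_net(pose.detach())
                  - photo_err.detach()).pow(2).mean()
    curio_loss.backward()
    opt_curio.step()

The structural point of the sketch is the two interleaved updates: the detector is rewarded for visiting poses whose error is still predicted to be high, while the curiosity network learns to explain that error away, so exploration pressure decays only in regions the optimization has already resolved.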