Unsupervised Discovery of 3D Physical Objects from Video

Paper Title

Unsupervised Discovery of 3D Physical Objects from Video

Authors

Du, Yilun; Smith, Kevin; Ulman, Tomer; Tenenbaum, Joshua; Wu, Jiajun

Abstract

We study the problem of unsupervised physical object discovery. While existing frameworks aim to decompose scenes into 2D segments based off each object's appearance, we explore how physics, especially object interactions, facilitates disentangling of 3D geometry and position of objects from video, in an unsupervised manner. Drawing inspiration from developmental psychology, our Physical Object Discovery Network (POD-Net) uses both multi-scale pixel cues and physical motion cues to accurately segment observable and partially occluded objects of varying sizes, and infer properties of those objects. Our model reliably segments objects on both synthetic and real scenes. The discovered object properties can also be used to reason about physical events.
