Paper Title
From Image Collections to Point Clouds with Self-supervised Shape and Pose Networks
Paper Authors
Paper Abstract
Reconstructing 3D models from 2D images is one of the fundamental problems in computer vision. In this work, we propose a deep learning technique for 3D object reconstruction from a single image. Contrary to recent works that use either 3D supervision or multi-view supervision, we use only single-view images with no pose information during training. This makes our approach more practical, requiring only an image collection of an object category and the corresponding silhouettes. We learn both 3D point cloud reconstruction and pose estimation networks in a self-supervised manner, making use of a differentiable point cloud renderer to train with 2D supervision. A key novelty of the proposed technique is to impose 3D geometric reasoning on the predicted 3D point clouds by rotating them with randomly sampled poses and then enforcing cycle consistency on both 3D reconstructions and poses. In addition, using single-view supervision allows us to perform test-time optimization on a given test image. Experiments on the synthetic ShapeNet and real-world Pix3D datasets demonstrate that our approach, despite using less supervision, can achieve competitive performance compared to pose-supervised and multi-view supervised approaches.
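
The abstract outlines the training signal: a differentiable point cloud renderer provides 2D silhouette supervision, while rotating the predicted cloud by a randomly sampled pose and re-encoding the resulting view enforces cycle consistency on both shape and pose. Below is a minimal PyTorch sketch of one such training step. The tiny networks, the soft splatting "renderer", the z-axis-only pose parameterization, and the unit loss weights are illustrative stand-ins under assumed settings, not the authors' implementation.

    # Minimal sketch of self-supervised shape + pose training with 2D silhouette
    # supervision and shape/pose cycle consistency. All components are stand-ins.
    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ShapeNetEnc(nn.Module):
        """Stand-in encoder: image -> N x 3 point cloud in a canonical frame."""
        def __init__(self, n_points=1024):
            super().__init__()
            self.n_points = n_points
            self.net = nn.Sequential(
                nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, n_points * 3),
            )
        def forward(self, img):
            return self.net(img).view(-1, self.n_points, 3)

    class PoseEnc(nn.Module):
        """Stand-in encoder: image -> rotation about the z-axis (azimuth, radians)."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(1, 16, 4, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, 1),
            )
        def forward(self, img):
            return self.net(img).squeeze(-1)

    def rotate_z(points, azimuth):
        """Rotate point clouds (B, N, 3) about the z-axis by per-example angles (B,)."""
        c, s = torch.cos(azimuth), torch.sin(azimuth)
        zeros, ones = torch.zeros_like(c), torch.ones_like(c)
        R = torch.stack([
            torch.stack([c, -s, zeros], dim=-1),
            torch.stack([s,  c, zeros], dim=-1),
            torch.stack([zeros, zeros, ones], dim=-1),
        ], dim=-2)                                   # (B, 3, 3)
        return points @ R.transpose(-1, -2)

    def render_silhouette(points, size=32):
        """Toy differentiable 'renderer': splat x-y coordinates onto a soft 2D grid.
        The paper uses a proper differentiable point cloud renderer instead."""
        xs = torch.linspace(-1, 1, size, device=points.device)
        gy, gx = torch.meshgrid(xs, xs, indexing="ij")
        grid = torch.stack([gx, gy], dim=-1).view(1, 1, size * size, 2)
        d2 = ((points[..., :2].unsqueeze(2) - grid) ** 2).sum(-1)    # (B, N, S*S)
        sil = 1.0 - torch.prod(1.0 - torch.exp(-d2 / 0.01), dim=1)   # soft union of splats
        return sil.view(-1, size, size)

    # One self-supervised training step on a batch of images + silhouettes.
    shape_net, pose_net = ShapeNetEnc(), PoseEnc()
    opt = torch.optim.Adam(list(shape_net.parameters()) + list(pose_net.parameters()), lr=1e-4)

    imgs = torch.rand(4, 1, 64, 64)                      # placeholder image batch
    gt_sil = (torch.rand(4, 32, 32) > 0.5).float()       # placeholder silhouettes

    points = shape_net(imgs)                             # predicted canonical point cloud
    pose = pose_net(imgs)                                # predicted pose (azimuth)

    # 2D supervision: rendered silhouette of the posed cloud should match the input mask.
    sil = render_silhouette(rotate_z(points, pose)).clamp(1e-6, 1 - 1e-6)
    loss_2d = F.binary_cross_entropy(sil, gt_sil)

    # Cycle consistency: rotate the prediction by a random pose, render that novel view,
    # and require the networks to recover the same shape and the sampled pose from it.
    rand_pose = torch.rand(points.shape[0]) * 2 * math.pi
    novel_view = render_silhouette(rotate_z(points, rand_pose)).unsqueeze(1)
    novel_view = F.interpolate(novel_view, size=(64, 64))        # match encoder input size
    loss_shape_cyc = F.mse_loss(shape_net(novel_view), points.detach())
    loss_pose_cyc = F.mse_loss(pose_net(novel_view), rand_pose)  # ignores angle periodicity

    loss = loss_2d + loss_shape_cyc + loss_pose_cyc
    opt.zero_grad(); loss.backward(); opt.step()

Because training only ever needs an image and its silhouette, the same objective (2D silhouette loss plus cycle consistency) can also be minimized for a single test image, which is the test-time optimization mentioned in the abstract.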