Paper Title
Lightweight integration of 3D features to improve 2D image segmentation
Paper Authors
Paper Abstract
Scene understanding has made tremendous progress over the past few years, as data acquisition systems now provide an increasing amount of data in various modalities (point cloud, depth, RGB...). However, this improvement comes at a large cost in computational resources and data annotation requirements. To analyze geometric information and images jointly, many approaches rely on both a 2D loss and a 3D loss, requiring not only per-pixel 2D labels but also per-point 3D labels. However, obtaining a 3D ground truth is challenging, time-consuming and error-prone. In this paper, we show that image segmentation can benefit from 3D geometric information without requiring a 3D ground truth, by training the geometric feature extraction and the 2D segmentation network jointly, in an end-to-end fashion, using only the 2D segmentation loss. Our method starts by extracting a map of 3D features directly from the provided point cloud using a lightweight 3D neural network. The 3D feature map, merged with the RGB image, is then used as input to a classical image segmentation network. Because no 3D ground truth is required, our method can be applied to many 2D segmentation networks, significantly improving their performance with only a marginal increase in network weights and light input dataset requirements.
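To make the described pipeline concrete, here is a minimal PyTorch sketch of the idea in the abstract: a lightweight point-wise network produces per-point 3D features, those features are projected onto the image plane, concatenated with the RGB channels, fed to a 2D segmentation backbone, and the whole stack is trained end-to-end with a single 2D cross-entropy loss. All names (PointFeatureNet, splat_to_image, FusedSegNet), the nearest-pixel projection, the feature dimensions, and the toy backbone are assumptions for illustration, not the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointFeatureNet(nn.Module):
    """Hypothetical lightweight point-wise MLP producing per-point 3D features."""
    def __init__(self, feat_dim=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 32), nn.ReLU(),
            nn.Linear(32, feat_dim),
        )

    def forward(self, points):           # points: (N, 3) xyz coordinates
        return self.mlp(points)          # (N, feat_dim)

def splat_to_image(points_uv, feats, h, w):
    """Scatter per-point features onto a (feat_dim, h, w) map using known
    pixel coordinates (a simple nearest-pixel projection; assumption)."""
    fmap = torch.zeros(feats.shape[1], h, w)
    u = points_uv[:, 0].clamp(0, w - 1).long()
    v = points_uv[:, 1].clamp(0, h - 1).long()
    fmap[:, v, u] = feats.t()            # gradients flow back to feats
    return fmap

class FusedSegNet(nn.Module):
    """Stand-in 2D segmentation network taking RGB + 3D feature channels;
    any classical segmentation backbone could replace it."""
    def __init__(self, feat_dim=8, n_classes=13):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3 + feat_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, n_classes, 1),
        )

    def forward(self, rgb, feat3d):
        x = torch.cat([rgb, feat3d], dim=1)   # channel-wise fusion of RGB + 3D features
        return self.backbone(x)

# One training step: only a 2D per-pixel loss, no 3D labels anywhere.
point_net, seg_net = PointFeatureNet(), FusedSegNet()
opt = torch.optim.Adam(list(point_net.parameters()) + list(seg_net.parameters()))

pts = torch.rand(500, 3)                      # toy point cloud
uv = torch.randint(0, 64, (500, 2)).float()   # assumed known point-to-pixel projections
rgb = torch.rand(1, 3, 64, 64)                # RGB image
labels = torch.randint(0, 13, (1, 64, 64))    # 2D per-pixel labels only

opt.zero_grad()
feats = point_net(pts)
fmap = splat_to_image(uv, feats, 64, 64).unsqueeze(0)
logits = seg_net(rgb, fmap)
loss = F.cross_entropy(logits, labels)        # single 2D loss trains both networks
loss.backward()
opt.step()
```

The key property the sketch illustrates is that the 3D branch receives its gradient signal entirely through the 2D segmentation loss, which is why no 3D ground truth is needed.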