Paper Title

Weakly Supervised 3D Object Detection from Point Clouds

Authors

Zengyi Qin, Jinglu Wang, Yan Lu

Abstract

A crucial task in scene understanding is 3D object detection, which aims to detect and localize the 3D bounding boxes of objects belonging to specific classes. Existing 3D object detectors heavily rely on annotated 3D bounding boxes during training, while these annotations could be expensive to obtain and only accessible in limited scenarios. Weakly supervised learning is a promising approach to reducing the annotation requirement, but existing weakly supervised object detectors are mostly for 2D detection rather than 3D. In this work, we propose VS3D, a framework for weakly supervised 3D object detection from point clouds without using any ground truth 3D bounding box for training. First, we introduce an unsupervised 3D proposal module that generates object proposals by leveraging normalized point cloud densities. Second, we present a cross-modal knowledge distillation strategy, where a convolutional neural network learns to predict the final results from the 3D object proposals by querying a teacher network pretrained on image datasets. Comprehensive experiments on the challenging KITTI dataset demonstrate the superior performance of our VS3D in diverse evaluation settings. The source code and pretrained models are publicly available at https://github.com/Zengyi-Qin/Weakly-Supervised-3D-Object-Detection.
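The abstract does not define "normalized point cloud densities", so the following is only an illustrative sketch of the general idea, not the VS3D formulation. It assumes that the number of LiDAR returns on an object falls off roughly with the square of its distance from the sensor, and therefore scales the raw point count inside a candidate box by the squared range of the box center. The function name `normalized_density`, the axis-aligned box representation, and the volume normalization are all hypothetical choices made for this example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def normalized_density(
    points: Sequence[Point],
    box_min: Point,
    box_max: Point,
) -> float:
    """Hypothetical range-compensated density of points inside a box.

    Assumption (not taken from the paper): the point count is multiplied
    by the squared distance from the sensor origin to the box center and
    divided by the box volume, so that near and far boxes give comparable
    scores.
    """
    volume = math.prod(hi - lo for lo, hi in zip(box_min, box_max))
    if volume <= 0:
        raise ValueError("box must have positive extent on every axis")

    # Count the points that fall inside the axis-aligned box.
    inside = sum(
        1
        for p in points
        if all(lo <= c <= hi for c, lo, hi in zip(p, box_min, box_max))
    )

    # Squared range from the sensor (assumed at the origin) to the box center.
    center = [(lo + hi) / 2.0 for lo, hi in zip(box_min, box_max)]
    range_sq = sum(c * c for c in center)

    return inside * range_sq / volume


# Two of the three points fall inside a 2 x 2 x 2 box centered 10 m away.
cloud = [(10.0, 0.0, 0.0), (10.5, 0.2, 0.0), (30.0, 0.0, 0.0)]
score = normalized_density(cloud, (9.0, -1.0, -1.0), (11.0, 1.0, 1.0))
print(score)  # 25.0
```

In a proposal stage built on a score like this, candidate boxes whose score exceeds a threshold would be kept as object proposals. The threshold and the way candidates are generated are left open here because the abstract does not specify them.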
