DSGN: Deep Stereo Geometry Network for 3D Object Detection

Paper Title

DSGN: Deep Stereo Geometry Network for 3D Object Detection

Authors

Yilun Chen, Shu Liu, Xiaoyong Shen, Jiaya Jia

Abstract

Most state-of-the-art 3D object detectors heavily rely on LiDAR sensors because there is a large performance gap between image-based and LiDAR-based methods. It is caused by the way to form representation for the prediction in 3D scenarios. Our method, called Deep Stereo Geometry Network (DSGN), significantly reduces this gap by detecting 3D objects on a differentiable volumetric representation -- 3D geometric volume, which effectively encodes 3D geometric structure for 3D regular space. With this representation, we learn depth information and semantic cues simultaneously. For the first time, we provide a simple and effective one-stage stereo-based 3D detection pipeline that jointly estimates the depth and detects 3D objects in an end-to-end learning manner. Our approach outperforms previous stereo-based 3D detectors (about 10 higher in terms of AP) and even achieves comparable performance with several LiDAR-based methods on the KITTI 3D object detection leaderboard. Our code is publicly available at https://github.com/chenyilun95/DSGN.
