Paper Title
Driving among Flatmobiles: Bird-Eye-View occupancy grids from a monocular camera for holistic trajectory planning
Paper Authors
Abstract
Camera-based end-to-end driving neural networks bring the promise of a low-cost system that maps camera images to driving control commands. These networks are appealing because they replace laborious hand-engineered building blocks, but their black-box nature makes them difficult to probe in case of failure. Recent works have shown the importance of using an explicit intermediate representation, which has the benefit of increasing both the interpretability and the accuracy of the network's decisions. Nonetheless, these camera-based networks reason in camera view, where scale is not homogeneous and hence not directly suitable for motion forecasting. In this paper, we introduce a novel monocular camera-only holistic end-to-end trajectory planning network with a Bird-Eye-View (BEV) intermediate representation that comes in the form of binary Occupancy Grid Maps (OGMs). To ease the prediction of OGMs in BEV from camera images, we introduce a novel scheme where the OGMs are first predicted as semantic masks in camera view and then warped into BEV using the homography between the two planes. The key element allowing this transformation to be applied to 3D objects such as vehicles consists in predicting solely their footprint in camera view, hence respecting the flat-world hypothesis implied by the homography.
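The camera-to-BEV warping step described above can be sketched with a planar homography. The following is a minimal, illustrative NumPy implementation (not the paper's code): `warp_mask_to_bev` and its parameters are hypothetical names, and it assumes a known 3x3 homography `H` mapping camera-view pixel coordinates to BEV grid cells, applied to a binary footprint mask via inverse warping with nearest-neighbor sampling.

```python
import numpy as np

def warp_mask_to_bev(mask, H, bev_shape):
    """Inverse-warp a camera-view binary footprint mask into a BEV occupancy grid.

    mask      : (h, w) binary array, the semantic footprint mask in camera view.
    H         : 3x3 homography mapping camera-view pixels (u, v, 1) to BEV cells.
    bev_shape : (rows, cols) of the output occupancy grid.
    """
    H_inv = np.linalg.inv(H)

    # Homogeneous coordinates of every BEV cell center, shape 3 x N.
    ys, xs = np.mgrid[0:bev_shape[0], 0:bev_shape[1]]
    bev_pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T

    # Map each BEV cell back into the camera image and normalize (perspective divide).
    cam_pts = H_inv @ bev_pts
    cam_pts = cam_pts / cam_pts[2:3, :]
    u = np.round(cam_pts[0]).astype(int)
    v = np.round(cam_pts[1]).astype(int)

    # Keep only cells whose preimage falls inside the camera image.
    valid = (u >= 0) & (u < mask.shape[1]) & (v >= 0) & (v < mask.shape[0])

    ogm = np.zeros(bev_shape, dtype=np.uint8)
    ogm.reshape(-1)[valid] = mask[v[valid], u[valid]]
    return ogm
```

The inverse-warping direction (iterating over output cells rather than input pixels) avoids holes in the BEV grid. Because the homography is only valid for points on the ground plane, warping full vehicle masks would smear their upper parts across the grid; predicting only the footprint, as the abstract explains, keeps the warped occupancy geometrically consistent.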