Paper Title
E3D: Event-Based 3D Shape Reconstruction
Paper Authors
Abstract
3D shape reconstruction is a primary component of augmented/virtual reality. Despite being highly advanced, existing solutions based on RGB, RGB-D, and Lidar sensors are power- and data-intensive, which poses challenges for deployment on edge devices. We approach 3D reconstruction with an event camera, a sensor with significantly lower power, latency, and data expense, while enabling high dynamic range. While previous event-based 3D reconstruction methods are primarily based on stereo vision, we cast the problem as multi-view shape-from-silhouette using a monocular event camera. The output of a moving event camera is a sparse point set of space-time gradients, largely sketching scene/object edges and contours. We first introduce an Event-to-Silhouette (E2S) neural network module that transforms a stack of event frames into the corresponding silhouettes, with additional neural branches for camera pose regression. Second, we introduce E3D, which employs a 3D differentiable renderer (PyTorch3D) to enforce cross-view 3D mesh consistency and fine-tune the E2S and pose networks. Lastly, we introduce a 3D-to-events simulation pipeline and apply it to publicly available object datasets to generate synthetic event/silhouette training pairs for supervised learning.
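The abstract refers to stacking events into frames before feeding them to E2S. As a rough illustration (not the paper's exact encoding), the sketch below bins an asynchronous (t, x, y, polarity) event stream into a fixed number of per-polarity count frames; the function name, channel layout, and uniform time binning are all assumptions.

```python
import numpy as np

def events_to_frames(events, num_frames, height, width):
    """Bin an event stream into a (num_frames, 2, H, W) stack of count frames.

    events: (N, 4) array of (t, x, y, polarity), assumed sorted by time;
    channel 0 counts OFF events, channel 1 counts ON events.
    """
    t = events[:, 0]
    # Uniform time bins spanning the stream; each event lands in one frame.
    bins = np.linspace(t[0], t[-1], num_frames + 1)
    frame_idx = np.clip(np.searchsorted(bins, t, side="right") - 1, 0, num_frames - 1)
    x = events[:, 1].astype(np.int64)
    y = events[:, 2].astype(np.int64)
    p = (events[:, 3] > 0).astype(np.int64)  # polarity channel: 0 = OFF, 1 = ON
    frames = np.zeros((num_frames, 2, height, width), dtype=np.float32)
    np.add.at(frames, (frame_idx, p, y, x), 1.0)  # scatter-add event counts
    return frames
```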
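The fine-tuning stage hinges on PyTorch3D's differentiable silhouette rendering: the current mesh estimate is rendered from each regressed camera pose and compared against the corresponding E2S silhouette. Below is a minimal sketch of such a per-view loss using the library's SoftSilhouetteShader and an IoU-style objective; the ico_sphere stand-in mesh, the placeholder target silhouette, and the exact loss form are illustrative assumptions, not the paper's implementation.

```python
import math
import torch
from pytorch3d.utils import ico_sphere
from pytorch3d.renderer import (
    BlendParams, FoVPerspectiveCameras, MeshRasterizer, MeshRenderer,
    RasterizationSettings, SoftSilhouetteShader, look_at_view_transform,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
mesh = ico_sphere(level=4, device=device)  # stand-in for the current mesh estimate

# SoftRas-style soft rasterization so gradients flow through silhouette edges.
blend = BlendParams(sigma=1e-4, gamma=1e-4)
raster_settings = RasterizationSettings(
    image_size=128,
    blur_radius=math.log(1.0 / 1e-4 - 1.0) * blend.sigma,
    faces_per_pixel=50,
)
renderer = MeshRenderer(
    rasterizer=MeshRasterizer(raster_settings=raster_settings),
    shader=SoftSilhouetteShader(blend_params=blend),
)

def silhouette_iou_loss(mesh, R, T, target_sil):
    """Render the mesh at pose (R, T) and score it against a target silhouette."""
    cameras = FoVPerspectiveCameras(R=R, T=T, device=device)
    rendered = renderer(meshes_world=mesh, cameras=cameras)[..., 3]  # alpha channel
    inter = (rendered * target_sil).sum(dim=(1, 2))
    union = (rendered + target_sil - rendered * target_sil).sum(dim=(1, 2))
    return (1.0 - inter / union.clamp(min=1e-6)).mean()  # 1 - soft IoU

# One hypothetical view: a regressed pose and a placeholder target silhouette.
R, T = look_at_view_transform(dist=2.7, elev=10.0, azim=30.0, device=device)
target = torch.ones(1, 128, 128, device=device)  # would come from E2S in practice
loss = silhouette_iou_loss(mesh, R, T, target)
```

Summing a loss of this kind over all views (typically alongside standard mesh regularizers such as Laplacian smoothing) is one way a cross-view consistency signal could back-propagate through the shared 3D mesh to fine-tune the E2S and pose networks, as the abstract describes.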