Paper title
Self-Supervised Learning for Monocular Depth Estimation from Aerial Imagery
Authors
Abstract
Supervised learning-based methods for monocular depth estimation usually require large amounts of extensively annotated training data. In the case of aerial imagery, this ground truth is particularly difficult to acquire. Therefore, in this paper, we present a method for self-supervised learning for monocular depth estimation from aerial imagery that does not require annotated training data. For this, we only use an image sequence from a single moving camera and learn to simultaneously estimate depth and pose information. By sharing the weights between pose and depth estimation, we achieve a relatively small model, which favors real-time application. We evaluate our approach on three diverse datasets and compare the results to conventional methods that estimate depth maps based on multi-view geometry. We achieve a δ1.25 accuracy of up to 93.5%. In addition, we pay particular attention to the generalization of a trained model to unknown data and to the self-improving capabilities of our approach. We conclude that, even though the results of monocular depth estimation are inferior to those achieved by conventional methods, they are well suited to provide a good initialization for methods that rely on image matching, or to provide estimates in regions where image matching fails, e.g., occluded or textureless regions.
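The abstract reports a δ1.25 accuracy without defining it. As a minimal sketch, assuming the definition commonly used in the depth estimation literature (the fraction of valid pixels for which max(d_pred / d_gt, d_gt / d_pred) is below 1.25), the metric could be computed as follows. The function name `threshold_accuracy` and the convention of treating non-positive depths as invalid are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def threshold_accuracy(
    pred: Sequence[float],
    gt: Sequence[float],
    threshold: float = 1.25,
) -> float:
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold.

    Assumption: depth maps are passed as flattened sequences, and pixels
    with a non-positive predicted or reference depth are skipped as invalid.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of pixels")

    valid = 0
    hits = 0
    for p, g in zip(pred, gt):
        if p <= 0 or g <= 0:
            continue  # skip pixels without a usable depth value
        valid += 1
        ratio = max(p / g, g / p)  # symmetric relative error, always >= 1
        if ratio < threshold:
            hits += 1

    if valid == 0:
        raise ValueError("no valid pixels to evaluate")
    return hits / valid


if __name__ == "__main__":
    predicted = [10.0, 12.0, 20.0, 5.0]
    reference = [10.0, 10.0, 10.0, 0.0]  # last pixel has no reference depth
    print(threshold_accuracy(predicted, reference))
```

In this toy input, two of the three valid pixels fall within the 1.25 ratio, so the function returns 2/3. Under the same assumed definition, the reported value of up to 93.5% would mean that this fraction reaches 0.935 on the best-performing dataset.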