Paper Title

Intelligent Reference Curation for Visual Place Recognition via Bayesian Selective Fusion

Paper Authors

Molloy, Timothy L., Fischer, Tobias, Milford, Michael, Nair, Girish N.

Paper Abstract

A key challenge in visual place recognition (VPR) is recognizing places despite drastic visual appearance changes due to factors such as time of day, season, weather or lighting conditions. Numerous approaches based on deep-learnt image descriptors, sequence matching, domain translation, and probabilistic localization have had success in addressing this challenge, but most rely on the availability of carefully curated representative reference images of the possible places. In this paper, we propose a novel approach, dubbed Bayesian Selective Fusion, for actively selecting and fusing informative reference images to determine the best place match for a given query image. The selective element of our approach avoids the counterproductive fusion of every reference image and enables the dynamic selection of informative reference images in environments with changing visual conditions (such as indoors with flickering lights, outdoors during sunshowers or over the day-night cycle). The probabilistic element of our approach provides a means of fusing multiple reference images that accounts for their varying uncertainty via a novel training-free likelihood function for VPR. On difficult query images from two benchmark datasets, we demonstrate that our approach matches and exceeds the performance of several alternative fusion approaches along with state-of-the-art techniques that are provided with prior (unfair) knowledge of the best reference images. Our approach is well suited for long-term robot autonomy where dynamic visual environments are commonplace since it is training-free, descriptor-agnostic, and complements existing techniques such as sequence matching.
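The abstract describes two key elements: a selective step that picks only informative reference images, and a probabilistic step that fuses their likelihoods in a Bayesian manner. The sketch below illustrates the general idea under simple assumptions; it is not the authors' actual method. The exponential likelihood model, the entropy-based informativeness score, the cosine-distance descriptor comparison, and the function name `bayesian_selective_fusion` are all illustrative choices, not details taken from the paper.

```python
import numpy as np

def bayesian_selective_fusion(query_desc, ref_sets, top_k=2, beta=5.0):
    """Illustrative sketch of selective Bayesian fusion for place recognition.

    query_desc : (D,) descriptor of the query image
    ref_sets   : list of (P, D) arrays, one per reference traverse/condition
                 (row p = descriptor of place p under that condition)
    top_k      : number of most informative reference sets to fuse (selective step)
    beta       : sharpness of the assumed exponential likelihood model
    """
    likelihoods = []
    informativeness = []
    for refs in ref_sets:
        # Cosine distance from the query to every place in this reference set.
        d = 1.0 - refs @ query_desc / (
            np.linalg.norm(refs, axis=1) * np.linalg.norm(query_desc) + 1e-12)
        # Assumed exponential likelihood: smaller distance -> higher likelihood.
        lik = np.exp(-beta * d)
        lik /= lik.sum()
        likelihoods.append(lik)
        # A peaked (low-entropy) likelihood is treated as more informative;
        # a flat one would contribute little and could be counterproductive.
        entropy = -(lik * np.log(lik + 1e-12)).sum()
        informativeness.append(-entropy)
    # Selective step: keep only the top_k most informative reference sets.
    chosen = np.argsort(informativeness)[::-1][:top_k]
    # Bayesian fusion under a uniform prior: multiply selected likelihoods.
    posterior = np.ones(ref_sets[0].shape[0])
    for i in chosen:
        posterior *= likelihoods[i]
    posterior /= posterior.sum()
    return int(np.argmax(posterior)), posterior
```

Because the fusion operates on likelihoods rather than raw descriptor distances, reference sets with different uncertainty naturally carry different weight, and the selection step avoids multiplying in uninformative (near-uniform) likelihoods from references captured under mismatched visual conditions.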
