Paper Title
University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization
Paper Authors
Paper Abstract
We consider the problem of cross-view geo-localization. The primary challenge of this task is to learn features that are robust to large viewpoint changes. Existing benchmarks can help, but they are limited in the number of viewpoints. They usually provide image pairs covering only two viewpoints, e.g., satellite and ground, which may compromise feature learning. In this paper, we argue that drones could serve as a third platform, besides phone cameras and satellites, for the geo-localization problem. In contrast to traditional ground-view images, drone-view images encounter fewer obstacles, e.g., trees, and can provide a comprehensive view when the drone flies around the target place. To verify the effectiveness of the drone platform, we introduce a new multi-view multi-source benchmark for drone-based geo-localization, named University-1652. University-1652 contains data from three platforms, i.e., synthetic drones, satellites and ground cameras, covering 1,652 university buildings around the world. To our knowledge, University-1652 is the first drone-based geo-localization dataset, and it enables two new tasks, i.e., drone-view target localization and drone navigation. As the name implies, drone-view target localization aims to predict the location of the target place from drone-view images. Drone navigation, on the other hand, aims to drive the drone to the area of interest shown in a given satellite-view query image. We use this dataset to analyze a variety of off-the-shelf CNN features and propose a strong CNN baseline on this challenging dataset. The experiments show that University-1652 helps the model learn viewpoint-invariant features and that the learned model also generalizes well to real-world scenarios.
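The two tasks named in the abstract can be pictured as cross-view retrieval: a query image from one platform is matched against a gallery of images from another. The sketch below is only an illustration under that assumption, not the paper's baseline. The feature vectors are made-up toy values standing in for CNN features, and the names `rank_gallery`, `satellite_gallery`, and `drone_gallery` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_gallery(query_feature, gallery):
    """Return gallery ids sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query_feature, feat), gid)
              for gid, feat in gallery.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [gid for _, gid in scored]


# Toy features (assumed values, not taken from the dataset).
satellite_gallery = {
    "building_A": [0.9, 0.1, 0.0],
    "building_B": [0.1, 0.9, 0.2],
    "building_C": [0.0, 0.2, 0.9],
}
drone_gallery = {
    "building_A_view1": [0.8, 0.2, 0.1],
    "building_B_view1": [0.2, 0.8, 0.1],
    "building_C_view1": [0.1, 0.1, 0.9],
}

# Drone-view target localization: drone query -> satellite gallery.
drone_query = [0.8, 0.2, 0.1]
localization_ranking = rank_gallery(drone_query, satellite_gallery)

# Drone navigation: satellite query -> drone gallery.
satellite_query = [0.0, 0.2, 0.9]
navigation_ranking = rank_gallery(satellite_query, drone_gallery)

print(localization_ranking[0])
print(navigation_ranking[0])
```

In this framing, viewpoint-invariant features matter because images of the same building taken from different platforms should land close together, so that the correct gallery entry ranks first in both directions.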