Title
Enhancing Continuous Control of Mobile Robots for End-to-End Visual Active Tracking
Authors
Abstract
In recent decades, visual target tracking has been one of the primary research interests of the Robotics research community. Recent advances in Deep Learning have made visual tracking approaches effective and viable in a wide variety of applications, ranging from automotive to surveillance and human assistance. However, the majority of existing works focus exclusively on passive visual tracking, i.e., tracking elements in sequences of images under the assumption that no actions can be taken to adapt the camera position to the motion of the tracked entity. On the contrary, in this work, we address visual active tracking, in which the tracker has to actively search for and track a specified target. Current State-of-the-Art approaches use Deep Reinforcement Learning (DRL) techniques to address the problem in an end-to-end manner. However, two main problems arise: i) most contributions focus only on discrete action spaces, and the ones that consider continuous control do not achieve the same level of performance; and ii) if not properly tuned, DRL models can be challenging to train, resulting in considerably slow learning progress and poor final performance. To address these challenges, we propose a novel DRL-based visual active tracking system that provides continuous action policies. To accelerate training and improve overall performance, we introduce additional objective functions and a Heuristic Trajectory Generator (HTG) to facilitate learning. Through extensive experimentation, we show that our method can match and surpass the performance of other State-of-the-Art approaches, and demonstrate that, even when trained exclusively in simulation, it can successfully perform visual active tracking in real scenarios.