Semantic Curiosity for Active Visual Learning

Paper title

Semantic Curiosity for Active Visual Learning

Authors

Devendra Singh Chaplot, Helen Jiang, Saurabh Gupta, Abhinav Gupta

Abstract

In this paper, we study the task of embodied interactive learning for object detection. Given a set of environments (and some labeling budget), our goal is to learn an object detector by having an agent select what data to obtain labels for. How should an exploration policy decide which trajectory should be labeled? One possibility is to use a trained object detector's failure cases as an external reward. However, this will require labeling millions of frames required for training RL policies, which is infeasible. Instead, we explore a self-supervised approach for training our exploration policy by introducing a notion of semantic curiosity. Our semantic curiosity policy is based on a simple observation -- the detection outputs should be consistent. Therefore, our semantic curiosity rewards trajectories with inconsistent labeling behavior and encourages the exploration policy to explore such areas. The exploration policy trained via semantic curiosity generalizes to novel scenes and helps train an object detector that outperforms baselines trained with other possible alternatives such as random exploration, prediction-error curiosity, and coverage-maximizing exploration.
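
The abstract's core idea, rewarding trajectories whose detections are inconsistent, can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that each detection along a trajectory can be tied to a discrete location identifier, and it uses the entropy of the labels predicted at each location as the inconsistency measure. The names `label_entropy` and `inconsistency_reward` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def label_entropy(labels):
    """Shannon entropy (in nats) of the empirical label distribution."""
    counts = Counter(labels)
    total = len(labels)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def inconsistency_reward(observations):
    """Sum of per-location label entropies over one trajectory.

    `observations` is an iterable of (location_id, predicted_label) pairs.
    Both the pairing and the entropy measure are assumptions of this sketch.
    """
    labels_by_location = defaultdict(list)
    for location_id, predicted_label in observations:
        labels_by_location[location_id].append(predicted_label)
    return sum(label_entropy(v) for v in labels_by_location.values())


# The same location is labeled the same way from every viewpoint.
consistent = [("cell_3", "chair"), ("cell_3", "chair"), ("cell_7", "tv")]
# The detector changes its mind about cell_3 as the agent moves.
inconsistent = [("cell_3", "chair"), ("cell_3", "couch"), ("cell_7", "tv")]

print(inconsistency_reward(consistent))
print(inconsistency_reward(inconsistent))
```

Under these assumptions, a trajectory with stable labels earns no reward, while one where the detector disagrees with itself earns a positive reward, which is the behavior the abstract says the exploration policy is trained to seek out.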
