Improving Object Detection with Selective Self-supervised Self-training

Paper Title

Improving Object Detection with Selective Self-supervised Self-training

Authors

Yandong Li, Di Huang, Danfeng Qin, Liqiang Wang, Boqing Gong

Abstract

We study how to leverage Web images to augment human-curated object detection datasets. Our approach is two-pronged. On the one hand, we retrieve Web images by image-to-image search, which incurs less domain shift from the curated data than other search methods. The Web images are diverse, supplying a wide variety of object poses, appearances, their interactions with the context, etc. On the other hand, we propose a novel learning method motivated by two parallel lines of work that explore unlabeled data for image classification: self-training and self-supervised learning. They fail to improve object detectors in their vanilla forms due to the domain gap between the Web images and curated datasets. To tackle this challenge, we propose a selective net to rectify the supervision signals in Web images. It not only identifies positive bounding boxes but also creates a safe zone for mining hard negative boxes. We report state-of-the-art results on detecting backpacks and chairs from everyday scenes, along with other challenging object classes.
