Generalized Category Discovery

Paper title

Generalized Category Discovery

Authors

Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman

Abstract

In this paper, we consider a highly general image recognition setting in which, given a labelled set and an unlabelled set of images, the task is to categorize all images in the unlabelled set. Here, the unlabelled images may come from labelled classes or from novel ones. Existing recognition methods cannot deal with this setting because they make several restrictive assumptions, such as the unlabelled instances coming only from known classes or only from unknown classes, and the number of unknown classes being known a priori. We address the more unconstrained setting, naming it 'Generalized Category Discovery', and challenge all these assumptions. We first establish strong baselines by taking state-of-the-art algorithms from novel category discovery and adapting them for this task. Next, we propose the use of vision transformers with contrastive representation learning for this open-world setting. We then introduce a simple yet effective semi-supervised $k$-means method to cluster the unlabelled data into seen and unseen classes automatically, substantially outperforming the baselines. Finally, we also propose a new approach to estimate the number of classes in the unlabelled data. We thoroughly evaluate our approach on public datasets for generic object classification and on fine-grained datasets, leveraging the recent Semantic Shift Benchmark suite. Project page at https://www.robots.ox.ac.uk/~vgg/research/gcd
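
The abstract names a semi-supervised $k$-means step but does not spell out its details. The following is a minimal sketch of one common constrained formulation, not the authors' released implementation: labelled points are pinned to the cluster of their class, while unlabelled points are free to join any cluster, including clusters for novel classes. The function name, its parameters, the k-means++-style seeding of the extra centroids, and the requirement that every label in 0..n_seen-1 has at least one example are all assumptions made for illustration.

```python
import numpy as np


def semi_supervised_kmeans(X_lab, y_lab, X_unlab, k, n_iter=50, seed=0):
    """Constrained k-means sketch (hypothetical helper, not the paper's code).

    X_lab:   (n_l, d) labelled features
    y_lab:   (n_l,) integer labels in [0, n_seen); every label must occur
    X_unlab: (n_u, d) unlabelled features
    k:       total number of clusters, k >= n_seen
    Returns cluster ids for the unlabelled points and the (k, d) centroids.
    """
    X_lab = np.asarray(X_lab, dtype=float)
    X_unlab = np.asarray(X_unlab, dtype=float)
    y_lab = np.asarray(y_lab)
    rng = np.random.default_rng(seed)
    n_seen = int(y_lab.max()) + 1
    if k < n_seen:
        raise ValueError("k must be at least the number of labelled classes")

    # Seen-class centroids start at the labelled class means.
    centroids = [X_lab[y_lab == c].mean(axis=0) for c in range(n_seen)]
    # Assumption: seed the remaining centroids k-means++-style, sampling
    # unlabelled points with probability proportional to their squared
    # distance from the nearest existing centroid.
    for _ in range(k - n_seen):
        diff = X_unlab[:, None, :] - np.asarray(centroids)[None, :, :]
        d2 = (diff ** 2).sum(-1).min(axis=1)
        total = d2.sum()
        probs = d2 / total if total > 0 else np.full(len(X_unlab), 1 / len(X_unlab))
        centroids.append(X_unlab[rng.choice(len(X_unlab), p=probs)])
    centroids = np.asarray(centroids, dtype=float)

    def nearest(points, centers):
        return ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(axis=1)

    for _ in range(n_iter):
        assign = nearest(X_unlab, centroids)
        new_centroids = centroids.copy()
        for c in range(k):
            members = [X_unlab[assign == c]]
            if c < n_seen:
                # Labelled points are never reassigned: they always
                # contribute to the centroid of their own class.
                members.append(X_lab[y_lab == c])
            pts = np.concatenate(members)
            if len(pts):
                new_centroids[c] = pts.mean(axis=0)
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break

    return nearest(X_unlab, centroids), centroids
```

Calling the function with features from two labelled classes, unlabelled features, and k = 3 returns cluster ids in which 0 and 1 correspond to the labelled classes and 2 to a discovered cluster. The total number of clusters k is taken as given here; estimating it is a separate problem that the abstract also addresses.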
