Paper Title
SCAN: Learning to Classify Images without Labels
Paper Authors

Paper Abstract
Can we automatically group images into semantically meaningful clusters when ground-truth annotations are absent? The task of unsupervised image classification remains an important, and open challenge in computer vision. Several recent approaches have tried to tackle this problem in an end-to-end fashion. In this paper, we deviate from recent works, and advocate a two-step approach where feature learning and clustering are decoupled. First, a self-supervised task from representation learning is employed to obtain semantically meaningful features. Second, we use the obtained features as a prior in a learnable clustering approach. In doing so, we remove the ability for cluster learning to depend on low-level features, which is present in current end-to-end learning approaches. Experimental evaluation shows that we outperform state-of-the-art methods by large margins, in particular +26.6% on CIFAR10, +25.0% on CIFAR100-20 and +21.3% on STL10 in terms of classification accuracy. Furthermore, our method is the first to perform well on a large-scale dataset for image classification. In particular, we obtain promising results on ImageNet, and outperform several semi-supervised learning methods in the low-data regime without the use of any ground-truth annotations. The code is made publicly available at https://github.com/wvangansbeke/Unsupervised-Classification.
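The two-step approach described in the abstract (self-supervised feature learning, followed by a learnable clustering step that uses the learned features as a prior) can be illustrated with a minimal PyTorch sketch. This is not the authors' released implementation (see the GitHub link above for that); the neighbor-mining step, the loss form, and all function names and hyperparameters below are simplified assumptions for illustration only.

# Illustrative sketch of the decoupled two-step idea from the abstract.
# NOT the authors' exact implementation; loss form and names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClusteringHead(nn.Module):
    """Step 2: a learnable clustering head on top of (frozen) self-supervised features."""
    def __init__(self, feature_dim: int, num_clusters: int):
        super().__init__()
        self.head = nn.Linear(feature_dim, num_clusters)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Soft cluster-assignment probabilities for each image.
        return F.softmax(self.head(features), dim=1)

def mine_nearest_neighbors(features: torch.Tensor, k: int = 20) -> torch.Tensor:
    """Turn step-1 features into a prior: indices of the k most similar images."""
    normed = F.normalize(features, dim=1)
    similarity = normed @ normed.t()          # cosine similarity matrix (N, N)
    similarity.fill_diagonal_(-1.0)           # exclude the image itself
    return similarity.topk(k, dim=1).indices  # (N, k) neighbor indices

def neighbor_consistency_loss(anchor_probs, neighbor_probs, entropy_weight=5.0):
    """Push an image and its mined neighbor toward the same cluster, while an
    entropy term spreads assignments over all clusters to avoid collapse.
    (entropy_weight is an illustrative value, not the paper's setting.)"""
    consistency = -torch.log((anchor_probs * neighbor_probs).sum(dim=1) + 1e-8).mean()
    mean_probs = anchor_probs.mean(dim=0)
    entropy = -(mean_probs * torch.log(mean_probs + 1e-8)).sum()
    return consistency - entropy_weight * entropy

if __name__ == "__main__":
    # Toy usage: random features stand in for a pretrained self-supervised encoder.
    torch.manual_seed(0)
    features = torch.randn(256, 128)          # (N images, feature_dim)
    neighbors = mine_nearest_neighbors(features, k=5)
    head = ClusteringHead(feature_dim=128, num_clusters=10)
    optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

    for step in range(100):
        idx = torch.randint(0, features.size(0), (64,))
        nbr_idx = neighbors[idx, torch.randint(0, 5, (64,))]  # one random neighbor each
        loss = neighbor_consistency_loss(head(features[idx]), head(features[nbr_idx]))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

In a real run, the features would come from a self-supervised encoder trained on the same dataset (step one), and the clustering head would then be optimized against the mined neighbors (step two); the random features here merely keep the sketch self-contained and runnable.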