Self-Supervised Learning for Large-Scale Unsupervised Image Clustering

Paper title

Self-Supervised Learning for Large-Scale Unsupervised Image Clustering

Authors

Evgenii Zheltonozhskii, Chaim Baskin, Alex M. Bronstein, Avi Mendelson

Abstract

Unsupervised learning has always been appealing to machine learning researchers and practitioners, allowing them to avoid an expensive and complicated process of labeling the data. However, unsupervised learning of complex data is challenging, and even the best approaches show much weaker performance than their supervised counterparts. Self-supervised deep learning has become a strong instrument for representation learning in computer vision. However, those methods have not been evaluated in a fully unsupervised setting. In this paper, we propose a simple scheme for unsupervised classification based on self-supervised representations. We evaluate the proposed approach with several recent self-supervised methods showing that it achieves competitive results for ImageNet classification (39% accuracy on ImageNet with 1000 clusters and 46% with overclustering). We suggest adding the unsupervised evaluation to a set of standard benchmarks for self-supervised learning. The code is available at https://github.com/Randl/kmeans_selfsuper
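The abstract does not spell out the scheme, but the repository name suggests k-means run on self-supervised features. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the 2-D points stand in for feature vectors from a pretrained self-supervised network, the farthest-point initialisation is chosen only to keep the example deterministic, and majority-vote mapping of clusters to classes is one common way to score clusterings (including overclustering, where there are more clusters than classes). All function names here are hypothetical. Ground-truth labels are used only for evaluation, never for clustering.

```python
import random
from collections import Counter, defaultdict


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centers(points, k):
    """Deterministic farthest-point initialisation (an illustrative choice)."""
    centers = [points[0]]
    while len(centers) < k:
        # Add the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    centers = init_centers(points, k)
    assign = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_assign == assign:
            break  # converged: assignments stopped changing
        assign = new_assign
        # Update step: move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def majority_vote_accuracy(assign, labels):
    """Map each cluster to its most frequent true label, then score accuracy."""
    by_cluster = defaultdict(Counter)
    for a, y in zip(assign, labels):
        by_cluster[a][y] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_cluster.values())
    return correct / len(labels)


def make_blobs(n_per_class=50, seed=0):
    """Synthetic stand-in for features: two well-separated 2-D classes."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, offset in enumerate((0.0, 10.0)):
        for _ in range(n_per_class):
            points.append((offset + rng.uniform(-1, 1), offset + rng.uniform(-1, 1)))
            labels.append(label)
    return points, labels


if __name__ == "__main__":
    points, labels = make_blobs()
    for k in (2, 4):  # k=2 matches the class count; k=4 is overclustering
        print(k, majority_vote_accuracy(kmeans(points, k), labels))
```

When the number of clusters equals the number of classes, a one-to-one cluster-to-class assignment is another common scoring choice; the abstract does not say which mapping produced the reported 39% and 46% figures.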
