Paper Title
Beyond Simple Meta-Learning: Multi-Purpose Models for Multi-Domain, Active and Continual Few-Shot Learning
Paper Authors
Paper Abstract
Modern deep learning requires large-scale, extensively labelled datasets for training. Few-shot learning aims to alleviate this issue by learning effectively from few labelled examples. Previously proposed few-shot visual classifiers assume that the feature manifold on which classifier decisions are made has uncorrelated feature dimensions and uniform feature variance. In this work, we focus on addressing the limitations arising from this assumption by proposing a variance-sensitive class of models that operates in a low-label regime. The first method, Simple CNAPS, employs a hierarchically regularized Mahalanobis-distance-based classifier combined with a state-of-the-art neural adaptive feature extractor to achieve strong performance on the Meta-Dataset, mini-ImageNet, and tiered-ImageNet benchmarks. We further extend this approach to a transductive learning setting, proposing Transductive CNAPS. This transductive method combines a soft k-means parameter refinement procedure with a two-step task encoder to achieve improved test-time classification accuracy using unlabelled data. Transductive CNAPS achieves state-of-the-art performance on Meta-Dataset. Finally, we explore the use of our methods (Simple and Transductive) for "out of the box" continual and active learning. Extensive experiments on large-scale benchmarks illustrate the robustness and versatility of this relatively simple class of models. All trained model checkpoints and the corresponding source code have been made publicly available.
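The core idea behind the variance-sensitive classifier can be illustrated with a rough NumPy sketch: class queries by squared Mahalanobis distance, where each class covariance is blended with a task-level covariance (a loose analogue of the hierarchical regularization described above). The function name, the blending weight `lam = n / (n + 1)`, and the `eps` ridge term are illustrative assumptions, not the paper's exact formulation, and the sketch assumes support/query features have already been produced by the adaptive feature extractor.

```python
import numpy as np

def mahalanobis_classify(query, support, labels, eps=1.0):
    """Assign each query row to the class with the smallest squared
    Mahalanobis distance. Each class covariance is shrunk toward a
    task-level covariance (hypothetical weights; see lead-in)."""
    classes = np.unique(labels)
    d = support.shape[1]
    task_cov = np.cov(support, rowvar=False) if len(support) > 1 else np.zeros((d, d))
    dists = []
    for c in classes:
        x = support[labels == c]
        n = len(x)
        mu = x.mean(axis=0)
        class_cov = np.cov(x, rowvar=False) if n > 1 else np.zeros((d, d))
        lam = n / (n + 1.0)  # with few shots, lean more on the task covariance
        cov = lam * class_cov + (1.0 - lam) * task_cov + eps * np.eye(d)
        inv = np.linalg.inv(cov)
        diff = query - mu
        # batched quadratic form: diff_i^T @ inv @ diff_i for every query i
        dists.append(np.einsum('ij,jk,ik->i', diff, inv, diff))
    return classes[np.argmin(np.stack(dists, axis=1), axis=1)]
```

Because the covariance differs per class, decision boundaries need not be equidistant hyperplanes as they are under plain Euclidean (prototype) distance, which is precisely the uniform-variance assumption the paper relaxes.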