Complementary Labels Learning with Augmented Classes

Paper Title

Complementary Labels Learning with Augmented Classes

Authors

Zhongnian Li, Jian Zhang, Mengting Xu, Xinzheng Xu, Daoqiang Zhang

Abstract

Complementary Labels Learning (CLL) arises in many real-world tasks, such as private question classification and online learning, and aims to reduce annotation cost relative to standard supervised learning. Unfortunately, most previous CLL algorithms assume a stable environment rather than an open and dynamic scenario, in which data from augmented classes unseen during training may emerge in the testing phase. In this paper, we propose a novel problem setting called Complementary Labels Learning with Augmented Classes (CLLAC). The challenge is that a classifier trained with complementary labels should not only classify instances from the observed classes accurately, but also recognize instances from the augmented classes in the testing phase. Specifically, by using unlabeled data, we propose an unbiased estimator of the classification risk for CLLAC, which is provably consistent. Moreover, we provide a generalization error bound for the proposed method, which shows that the optimal parametric convergence rate is achieved for the estimation error. Finally, experimental results on several benchmark datasets verify the effectiveness of the proposed method.
