Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels

Paper title

Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels

Authors

Chuang Zhang, Li Shen, Jian Yang, Chen Gong

Abstract

The memorization effect of deep neural networks (DNNs) plays a pivotal role in recent label noise learning methods. To exploit this effect, model prediction-based methods have been widely adopted; they use the outputs of DNNs in the early stage of learning to correct noisy labels. However, we observe that the model makes mistakes during label prediction, resulting in unsatisfactory performance. By contrast, the features produced in the early stage of learning are more robust. Inspired by this observation, we propose a novel feature embedding-based method for deep learning with label noise, termed LabEl Noise Dilution (LEND). Specifically, we first compute a similarity matrix from the current embedded features to capture the local structure of the training data. Then, the noisy supervision signals carried by mislabeled data are overwhelmed by those of nearby correctly labeled data (i.e., label noise dilution), the effectiveness of which is guaranteed by the inherent robustness of feature embedding. Finally, the training data with diluted labels are used to train a robust classifier. Empirically, we conduct extensive experiments on both synthetic and real-world noisy datasets, comparing LEND with several representative robust learning approaches. The results verify the effectiveness of LEND.
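
The dilution step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the neighbourhood size, or the weighting scheme, so the cosine similarity, the `k` nearest neighbours, and the rule of adding each sample's own label with weight 1 are all assumptions made here for illustration. The function name `dilute_labels` is likewise hypothetical.

```python
import numpy as np


def dilute_labels(features, noisy_labels, num_classes, k=5):
    """Return soft labels mixing each sample's label with its neighbours' labels.

    Assumed design (not taken from the paper): cosine similarity on the
    embeddings, a k-nearest-neighbour graph, and similarity-weighted voting.
    """
    features = np.asarray(features, dtype=float)
    noisy_labels = np.asarray(noisy_labels, dtype=int)

    # Similarity matrix from L2-normalised embeddings (cosine similarity).
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a sample is not its own neighbour

    # Indices of the k most similar samples for every row.
    neighbours = np.argsort(-sim, axis=1)[:, :k]

    # Neighbour weights, clipped at zero so dissimilar samples cast no vote.
    weights = np.clip(np.take_along_axis(sim, neighbours, axis=1), 0.0, None)

    one_hot = np.eye(num_classes)[noisy_labels]
    # Own label (weight 1) plus the similarity-weighted neighbour labels.
    votes = one_hot + np.einsum("nk,nkc->nc", weights, one_hot[neighbours])

    # Normalise each row into a probability vector over classes.
    return votes / votes.sum(axis=1, keepdims=True)
```

Under these assumptions, a mislabeled sample surrounded by correctly labeled neighbours receives a soft label whose largest entry is the neighbours' class, and that soft label could then serve as the training target for the classifier. In practice `features` would come from the network's embedding layer at the current epoch, which this sketch leaves out.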
