Paper Title

Mitigating Memorization of Noisy Labels by Clipping the Model Prediction

Paper Authors

Hongxin Wei, Huiping Zhuang, Renchunzi Xie, Lei Feng, Gang Niu, Bo An, Yixuan Li

Paper Abstract

In the presence of noisy labels, designing robust loss functions is critical for securing the generalization performance of deep neural networks. Cross Entropy (CE) loss has been shown to be not robust to noisy labels due to its unboundedness. To alleviate this issue, existing works typically design specialized robust losses with the symmetric condition, which usually lead to the underfitting issue. In this paper, our key idea is to induce a loss bound at the logit level, thus universally enhancing the noise robustness of existing losses. Specifically, we propose logit clipping (LogitClip), which clamps the norm of the logit vector to ensure that it is upper bounded by a constant. In this manner, CE loss equipped with our LogitClip method is effectively bounded, mitigating the overfitting to examples with noisy labels. Moreover, we present theoretical analyses to certify the noise-tolerant ability of LogitClip. Extensive experiments show that LogitClip not only significantly improves the noise robustness of CE loss, but also broadly enhances the generalization performance of popular robust losses.
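To make the clipping operation concrete, below is a minimal PyTorch-style sketch of logit clipping as described in the abstract: the logit vector's norm is clamped so it is upper bounded by a constant before the loss is computed. The function name `clip_logits`, the threshold `tau`, and the choice of the L2 norm are illustrative assumptions, not values prescribed by the abstract.

```python
import torch
import torch.nn.functional as F

def clip_logits(logits: torch.Tensor, tau: float = 1.0, p: float = 2.0) -> torch.Tensor:
    """Clamp the norm of each logit vector so it is upper bounded by `tau`.

    Note: `tau` and the L-p norm choice are illustrative hyperparameters,
    assumed for this sketch rather than taken from the paper's abstract.
    """
    norms = logits.norm(p=p, dim=-1, keepdim=True)        # per-example logit norm
    scale = torch.clamp(tau / (norms + 1e-12), max=1.0)   # rescale only when norm > tau
    return logits * scale

# Hypothetical usage with a standard CE loss:
# clipped = clip_logits(model(x), tau=1.0)
# loss = F.cross_entropy(clipped, y)
```

Because the rescaling caps the logit norm, the resulting CE loss is bounded for every example, which is the mechanism the abstract credits for mitigating overfitting to noisy labels.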
