Paper Title

Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data

Paper Authors

Shenwang Jiang, Jianan Li, Ying Wang, Bo Huang, Zhang Zhang, Tingfa Xu

Paper Abstract

Corrupted labels and class imbalance are commonly encountered in practically collected training data, and both easily lead to over-fitting of deep neural networks (DNNs). Existing approaches alleviate these issues with a sample re-weighting strategy, i.e., re-weighting samples via a designed weighting function. However, such a strategy is only applicable to training data containing a single type of data bias. In practice, biased samples with corrupted labels and samples from tail classes commonly co-exist in training data, and handling them simultaneously is a key but under-explored problem. In this paper, we find that these two types of biased samples, though exhibiting similar transient losses, have distinguishable trends and characteristics in their loss curves, which provide valuable priors for sample weight assignment. Motivated by this, we delve into the loss curves and propose a novel probe-and-allocate training strategy: in the probing stage, we train the network on the whole biased training data without intervention and record the loss curve of each sample as an additional attribute; in the allocating stage, we feed the resulting attribute to a newly designed curve-perception network, named CurveNet, which learns to identify the bias type of each sample and adaptively assigns proper weights through meta-learning. The slow training speed of meta-learning also hinders its application. To address this, we propose skip-layer meta optimization (SLMO), which accelerates training by skipping the bottom layers during meta optimization. Extensive experiments on synthetic and real-world data validate the proposed method, which achieves state-of-the-art performance on multiple challenging benchmarks.
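
To make the probe-and-allocate idea in the abstract more concrete, below is a minimal PyTorch-style sketch. It is an illustration only: the function and class shapes (`probe`, a two-layer `CurveNet`, a data loader that also yields sample indices) are assumptions, not the authors' released implementation, and the real allocating stage additionally trains CurveNet with meta-learning on a small unbiased meta set (with SLMO skipping the bottom layers), which is omitted here for brevity.

```python
# Hypothetical sketch of the probe-and-allocate strategy; names and shapes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def probe(model, loader, optimizer, num_epochs, num_samples):
    """Probing stage: train on the full biased data without intervention and
    record each sample's loss at every epoch, yielding per-sample loss curves."""
    loss_curves = torch.zeros(num_samples, num_epochs)
    for epoch in range(num_epochs):
        for images, labels, indices in loader:  # loader also yields dataset indices
            logits = model(images)
            per_sample_loss = F.cross_entropy(logits, labels, reduction="none")
            loss_curves[indices, epoch] = per_sample_loss.detach().cpu()
            optimizer.zero_grad()
            per_sample_loss.mean().backward()
            optimizer.step()
    return loss_curves  # shape: (num_samples, num_epochs)


class CurveNet(nn.Module):
    """Curve-perception network (assumed MLP form): maps a sample's loss curve
    to a weight in [0, 1] reflecting its estimated bias type."""
    def __init__(self, num_epochs, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_epochs, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, curves):                # curves: (batch, num_epochs)
        return self.mlp(curves).squeeze(-1)   # per-sample weights in [0, 1]


def allocate_step(model, curve_net, loss_curves, images, labels, indices):
    """Allocating stage (simplified): re-weight the training loss with CurveNet.
    In the paper, curve_net itself is updated by meta-learning; not shown here."""
    logits = model(images)
    per_sample_loss = F.cross_entropy(logits, labels, reduction="none")
    weights = curve_net(loss_curves[indices].to(per_sample_loss.device))
    return (weights * per_sample_loss).mean()
```

The sketch only shows how loss curves could be collected and consumed as per-sample attributes; the meta-learning outer loop that fits CurveNet to a clean meta set, and the layer-skipping trick (SLMO) that speeds up that loop, are described in the paper itself.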
