Paper Title
Calibrating Deep Neural Networks using Focal Loss
Paper Authors
Paper Abstract
Miscalibration - a mismatch between a model's confidence and its correctness - of Deep Neural Networks (DNNs) makes their predictions hard to rely on. Ideally, we want networks to be accurate, calibrated and confident. We show that, as opposed to the standard cross-entropy loss, focal loss [Lin et al., 2017] allows us to learn models that are already very well calibrated. When combined with temperature scaling, whilst preserving accuracy, it yields state-of-the-art calibrated models. We provide a thorough analysis of the factors causing miscalibration, and use the insights we glean from this to justify the empirically excellent performance of focal loss. To facilitate the use of focal loss in practice, we also provide a principled approach to automatically select the hyperparameter involved in the loss function. We perform extensive experiments on a variety of computer vision and NLP datasets, and with a wide variety of network architectures, and show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases. Code is available at https://github.com/torrvision/focal_calibration.
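To make the loss concrete, below is a minimal PyTorch-style sketch of the focal loss referred to in the abstract. It is an illustrative example only, not the authors' reference implementation from the linked repository; `gamma` is the focusing hyperparameter that the paper proposes to select automatically, and the closing comment notes how temperature scaling rescales logits at test time.

```python
# Minimal sketch of the focal loss (Lin et al., 2017): FL(p_t) = -(1 - p_t)^gamma * log(p_t).
# Illustrative only; not the reference code from https://github.com/torrvision/focal_calibration.
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor, gamma: float = 3.0) -> torch.Tensor:
    log_pt = F.log_softmax(logits, dim=-1)                       # log-probabilities for all classes
    log_pt = log_pt.gather(1, targets.unsqueeze(1)).squeeze(1)   # log p_t of the true class
    pt = log_pt.exp()
    return (-((1.0 - pt) ** gamma) * log_pt).mean()              # gamma = 0 recovers cross-entropy

# Toy usage: random logits and labels stand in for a classifier's outputs.
logits = torch.randn(8, 10, requires_grad=True)   # batch of 8 examples, 10 classes
targets = torch.randint(0, 10, (8,))              # ground-truth class indices
focal_loss(logits, targets, gamma=3.0).backward()

# Post-hoc temperature scaling simply divides the logits by a scalar T (tuned on a
# held-out validation set) before the softmax: F.softmax(logits / T, dim=-1).
```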