Paper Title

Positive-Congruent Training: Towards Regression-Free Model Updates

Authors

Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto

Abstract

Reducing inconsistencies in the behavior of different versions of an AI system can be as important in practice as reducing its overall error. In image classification, sample-wise inconsistencies appear as "negative flips": A new model incorrectly predicts the output for a test sample that was correctly classified by the old (reference) model. Positive-congruent (PC) training aims at reducing error rate while at the same time reducing negative flips, thus maximizing congruency with the reference model only on positive predictions, unlike model distillation. We propose a simple approach for PC training, Focal Distillation, which enforces congruence with the reference model by giving more weight to samples that were correctly classified. We also found that, if the reference model itself can be chosen as an ensemble of multiple deep neural networks, negative flips can be further reduced without affecting the new model's accuracy.
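The two ideas in the abstract, negative flips and focal weighting, can be made concrete in code. Below is a minimal PyTorch-style sketch, not the authors' reference implementation: it assumes a per-sample weight of the form alpha + beta * 1[reference model correct] applied to a standard knowledge-distillation term, and the function names and hyperparameters (alpha, beta, temperature) are illustrative.

```python
import torch
import torch.nn.functional as F


def negative_flip_rate(new_logits, old_logits, labels):
    """Fraction of samples the old model classified correctly
    but the new model gets wrong (a "negative flip")."""
    new_pred = new_logits.argmax(dim=1)
    old_pred = old_logits.argmax(dim=1)
    flips = (old_pred == labels) & (new_pred != labels)
    return flips.float().mean()


def focal_distillation_loss(new_logits, old_logits, labels,
                            alpha=1.0, beta=5.0, temperature=1.0):
    """Cross-entropy on the new model plus a distillation term toward
    the old (reference) model, up-weighted on samples the reference
    model already classifies correctly."""
    # Ordinary task loss for the new model, kept per-sample.
    ce = F.cross_entropy(new_logits, labels, reduction="none")

    # Per-sample KL divergence between temperature-softened distributions;
    # the reference logits are detached since that model is frozen.
    log_p_new = F.log_softmax(new_logits / temperature, dim=1)
    p_old = F.softmax(old_logits.detach() / temperature, dim=1)
    kd = F.kl_div(log_p_new, p_old, reduction="none").sum(dim=1)

    # Focal weights (illustrative): a base weight alpha on every sample,
    # plus an extra weight beta on samples the reference model got right.
    old_correct = (old_logits.argmax(dim=1) == labels).float()
    weights = alpha + beta * old_correct

    return (ce + weights * kd * temperature ** 2).mean()
```

Setting beta = 0 in this sketch recovers plain knowledge distillation; the extra weight on the reference model's correct predictions is what biases training toward congruence on positive predictions only.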
