Paper Title

SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation

Paper Authors

K L Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash

Paper Abstract

Feature regression is a simple way to distill large neural network models to smaller ones. We show that with simple changes to the network architecture, regression can outperform more complex state-of-the-art approaches for knowledge distillation from self-supervised models. Surprisingly, the addition of a multi-layer perceptron head to the CNN backbone is beneficial even if used only during distillation and discarded in the downstream task. Deeper non-linear projections can thus be used to accurately mimic the teacher without changing inference architecture and time. Moreover, we utilize independent projection heads to simultaneously distill multiple teacher networks. We also find that using the same weakly augmented image as input for both teacher and student networks aids distillation. Experiments on ImageNet dataset demonstrate the efficacy of the proposed changes in various self-supervised distillation settings.
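
The abstract describes the method only at a high level, so the following is a minimal PyTorch-style sketch of such a regression-based distillation setup, not the authors' released code. The backbone choices (ResNet-50 teacher, ResNet-18 student), the helper `make_mlp_head`, the head dimensions, and the optimizer settings are illustrative assumptions.

```python
# Minimal sketch (assumptions noted above) of feature-regression distillation:
# a frozen self-supervised teacher, a smaller student backbone, and an MLP head
# used only during distillation and discarded for downstream inference.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

def make_mlp_head(in_dim, hidden_dim, out_dim, num_layers=4):
    """Deeper non-linear projection head; kept only during distillation."""
    layers, dim = [], in_dim
    for _ in range(num_layers - 1):
        layers += [nn.Linear(dim, hidden_dim),
                   nn.BatchNorm1d(hidden_dim),
                   nn.ReLU(inplace=True)]
        dim = hidden_dim
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)

# Teacher: a frozen self-supervised model (pretrained weights assumed loaded elsewhere).
teacher = models.resnet50(weights=None)
teacher.fc = nn.Identity()
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)

# Student: smaller backbone plus an MLP head that regresses the teacher's features.
student = models.resnet18(weights=None)
student.fc = nn.Identity()
head = make_mlp_head(in_dim=512, hidden_dim=2048, out_dim=2048)

optimizer = torch.optim.SGD(
    list(student.parameters()) + list(head.parameters()), lr=0.05, momentum=0.9
)

def distill_step(images):
    """One step on a batch of weakly augmented images, fed identically to both networks."""
    with torch.no_grad():
        target = F.normalize(teacher(images), dim=1)
    pred = F.normalize(head(student(images)), dim=1)
    # MSE between unit vectors is proportional to 2 - 2 * cosine similarity.
    loss = F.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage: loss = distill_step(weakly_augmented_batch)
# At inference time only `student` is kept; `head` is discarded.
```

For multiple teachers, the same idea extends by attaching one independent head per teacher to the shared student backbone and summing the per-teacher regression losses.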
