Title
Gradients as Features for Deep Representation Learning
Authors
Abstract
We address the challenging problem of deep representation learning: the efficient adaptation of a pre-trained deep network to different tasks. Specifically, we propose to explore gradient-based features. These features are the gradients of a task-specific loss with respect to the model parameters, evaluated at an input sample. Our key innovation is the design of a linear model that incorporates both the gradients and the activations of the pre-trained network. We show that our model provides a local linear approximation to an underlying deep model, and we discuss important theoretical insights. Moreover, we present an efficient algorithm for training and inference with our model that avoids computing the actual gradients. Our method is evaluated on a number of representation-learning tasks across several datasets and with different network architectures. Strong results are obtained in all settings and are well aligned with our theoretical insights.
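To make the idea of gradient-based features concrete, the following is a minimal sketch, not the paper's actual construction: a toy one-layer "pre-trained" network with a softmax cross-entropy loss, where the per-sample feature vector concatenates the layer's activation with the gradient of the loss with respect to the parameters. All names (`gradient_features`, the network shape, the choice of loss) are illustrative assumptions.

```python
import numpy as np

# Toy "pre-trained" one-layer network: z = W x + b, activation a = relu(z).
rng = np.random.default_rng(0)
d_in, d_out = 4, 3
W = rng.standard_normal((d_out, d_in))
b = rng.standard_normal(d_out)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def gradient_features(x, y):
    """Concatenate [activation; dL/dW; dL/db] for one sample.

    L is the cross-entropy of softmax(W x + b) against class index y;
    the gradients below are the closed-form derivatives of that loss.
    """
    z = W @ x + b
    a = relu(z)                # activation features
    p = softmax(z)             # predicted class probabilities
    dz = p.copy()
    dz[y] -= 1.0               # dL/dz for softmax + cross-entropy
    dW = np.outer(dz, x)       # dL/dW
    db = dz                    # dL/db
    return np.concatenate([a, dW.ravel(), db])

x = rng.standard_normal(d_in)
feat = gradient_features(x, y=1)
print(feat.shape)  # (d_out + d_out*d_in + d_out,) = (18,)
```

A linear classifier trained on `feat` would then be the kind of model the abstract describes, combining activation and gradient information; the paper's contribution includes avoiding the explicit gradient computation done here.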