Paper Title

Feed-Forward On-Edge Fine-tuning Using Static Synthetic Gradient Modules

Authors

Robby Neven, Marian Verhelst, Tinne Tuytelaars, Toon Goedemé

Abstract


Training deep learning models on embedded devices is typically avoided, since it requires more memory, computation, and power than inference. In this work, we focus on lowering the amount of memory needed to store all activations, which are required during the backward pass to compute the gradients. Instead, during the forward pass, static Synthetic Gradient Modules (SGMs) predict the gradients for each layer. This allows training the model in a feed-forward manner without having to store all activations. We tested our method on a robot grasping scenario in which a robot must learn to grasp new objects given only a single demonstration. By first training the SGMs in a meta-learning manner on a set of common objects, during fine-tuning the SGMs provided the model with accurate gradients to successfully learn to grasp new objects. We have shown that our method achieves results comparable to standard backpropagation.
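To make the mechanism concrete, here is a minimal toy sketch (not the authors' code; all names, the one-weight "layer", and the hand-picked SGM parameters are illustrative assumptions) of the core idea: during the forward pass, a small frozen module predicts the gradient at a layer's output, so the layer can be updated immediately and its activation discarded, with no backward pass needed.

```python
# Toy sketch of a static Synthetic Gradient Module (SGM).
# Assumption: a single scalar "layer" y = w * x trained toward the
# target function y = 2 * x, with a frozen linear SGM whose parameters
# (a, b) are hand-picked here to stand in for meta-learned ones.

def layer(w, x):
    # forward pass of the trainable layer
    return w * x

def sgm(a, b, h):
    # static (frozen) SGM: predicts dL/dh from the layer output h alone
    return a * h + b

w = 0.0           # trainable layer weight
a, b = 1.0, -2.0  # frozen SGM parameters (illustrative, not meta-learned):
                  # a*h + b = h - 2 equals the true gradient dL/dh of
                  # L = 0.5 * (h - 2*x)**2 when x = 1
lr = 0.1

for _ in range(100):
    x = 1.0
    h = layer(w, x)        # forward pass
    g_h = sgm(a, b, h)     # predicted gradient dL/dh -- no backprop
    w -= lr * g_h * x      # local update via chain rule: dL/dw = dL/dh * x
    # h can be discarded immediately; nothing is stored for a backward pass

print(round(w, 3))  # converges toward the target weight 2.0
```

In the paper's setting the SGMs are small networks meta-trained on common objects and then kept static during on-edge fine-tuning; this sketch only shows why a frozen gradient predictor lets each layer update in a feed-forward, memory-light way.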
