Paper Title

Architecture Matters in Continual Learning

Authors

Seyed Iman Mirzadeh, Arslan Chaudhry, Dong Yin, Timothy Nguyen, Razvan Pascanu, Dilan Gorur, Mehrdad Farajtabar

Abstract

A large body of research in continual learning is devoted to overcoming the catastrophic forgetting of neural networks by designing new algorithms that are robust to distribution shifts. However, the majority of these works are strictly focused on the "algorithmic" part of continual learning for a "fixed neural network architecture", and the implications of using different architectures are mostly neglected. Even the few existing continual learning methods that modify the model assume a fixed architecture and aim to develop an algorithm that efficiently uses the model throughout the learning experience. However, in this work, we show that the choice of architecture can significantly impact continual learning performance, and that different architectures lead to different trade-offs between remembering previous tasks and learning new ones. Moreover, we study the impact of various architectural decisions, and our findings yield best practices and recommendations that can improve continual learning performance.
