Paper Title

Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics

Authors

Vinay V. Ramasesh, Ethan Dyer, Maithra Raghu

Abstract

A central challenge in developing versatile machine learning systems is catastrophic forgetting: a model trained on tasks in sequence will suffer significant performance drops on earlier tasks. Despite the ubiquity of catastrophic forgetting, there is limited understanding of the underlying process and its causes. In this paper, we address this important knowledge gap, investigating how forgetting affects representations in neural network models. Through representational analysis techniques, we find that deeper layers are disproportionately the source of forgetting. Supporting this, a study of methods to mitigate forgetting illustrates that they act to stabilize deeper layers. These insights enable the development of an analytic argument and empirical picture relating the degree of forgetting to representational similarity between tasks. Consistent with this picture, we observe maximal forgetting occurs for task sequences with intermediate similarity. We perform empirical studies on the standard split CIFAR-10 setup and also introduce a novel CIFAR-100 based task approximating realistic input distribution shift.
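The representational analysis the abstract refers to compares a layer's activations on the same probe inputs before and after training on a later task. As a minimal sketch, the snippet below implements linear centered kernel alignment (CKA; Kornblith et al., 2019), a standard similarity measure for this kind of analysis; the function name and NumPy implementation are illustrative, not the authors' code.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between activation matrices.

    X, Y: (n_examples, n_features) activations of one layer on the same
    probe inputs, e.g. before and after training on a second task.
    Returns a similarity in [0, 1]; values near 1 mean the representations
    agree up to rotation and isotropic scaling.
    """
    # Center each feature dimension across examples.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by the
    # norms of the two self-covariances (linear CKA).
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    return cross / (np.linalg.norm(X.T @ X, ord="fro") *
                    np.linalg.norm(Y.T @ Y, ord="fro"))
```

Computing this per layer before and after sequential training shows which layers' representations drift most; a sharp similarity drop concentrated in deeper layers is the pattern the abstract describes.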
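The standard split CIFAR-10 setup partitions the ten classes into disjoint tasks trained in sequence. Below is a sketch of one common two-task, five-class split; the 0-4 / 5-9 class assignment, the label remapping, the `make_split_cifar10` name, and the use of torchvision are assumptions for illustration, not necessarily the paper's exact configuration.

```python
import numpy as np
from torchvision import datasets

def make_split_cifar10(root="./data"):
    """Partition CIFAR-10 into two sequential 5-class tasks (assumed split)."""
    train = datasets.CIFAR10(root, train=True, download=True)
    images = np.asarray(train.data)      # (50000, 32, 32, 3) uint8
    labels = np.asarray(train.targets)   # (50000,)
    tasks = []
    for classes in ([0, 1, 2, 3, 4], [5, 6, 7, 8, 9]):
        mask = np.isin(labels, classes)
        # Remap each task's labels to 0..4 so both tasks share a 5-way head.
        remapped = np.searchsorted(classes, labels[mask])
        tasks.append((images[mask], remapped))
    return tasks
```

Training on the first task, then the second, and re-evaluating on the first exposes the performance drop the abstract quantifies as forgetting.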
