Paper Title

Representation Projection Invariance Mitigates Representation Collapse

Paper Authors

Anastasia Razdaibiedina, Ashish Khetan, Zohar Karnin, Daniel Khashabi, Vishaal Kapoor, Vivek Madan

Paper Abstract

Fine-tuning contextualized representations learned by pre-trained language models remains a prevalent practice in NLP. However, fine-tuning can lead to representation degradation (also known as representation collapse), which may result in instability, sub-optimal performance, and weak generalization. In this paper, we propose Representation Projection Invariance (REPINA), a novel regularization method to maintain the information content of representations and reduce representation collapse during fine-tuning by discouraging undesirable changes in the representations. We study the empirical behavior of the proposed regularization in comparison to 5 comparable baselines across 13 language understanding tasks (GLUE benchmark and six additional datasets). When evaluating in-domain performance, REPINA consistently outperforms other baselines on most tasks (10 out of 13). We also demonstrate its effectiveness in few-shot settings and robustness to label perturbation. As a by-product, we extend previous studies of representation collapse and propose several metrics to quantify it. Our empirical findings show that our approach is significantly more effective at mitigating representation collapse.
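The abstract describes the core idea only at a high level: add a regularization term to the fine-tuning loss that discourages undesirable drift of the encoder's representations away from their pre-trained state. The snippet below is a minimal illustrative sketch of such a representation-anchoring regularizer in PyTorch; it is not the exact REPINA objective from the paper, and the toy SmallEncoder, the single learnable projection head, and the weight lam are assumptions introduced purely for illustration.

```python
# Illustrative sketch (not the paper's exact objective): fine-tune an encoder on a
# task while penalizing the distance between its representations and a learnable
# projection of the frozen pre-trained representations, discouraging large,
# information-destroying drift. SmallEncoder and lam are hypothetical names.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallEncoder(nn.Module):
    """Toy stand-in for a pre-trained LM encoder producing sentence-level vectors."""
    def __init__(self, d_in=32, d_hid=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, d_hid), nn.ReLU(), nn.Linear(d_hid, d_hid))

    def forward(self, x):
        return self.net(x)

torch.manual_seed(0)
d_in, d_hid, n_classes, lam = 32, 64, 3, 0.1

encoder = SmallEncoder(d_in, d_hid)            # fine-tuned encoder
pretrained = copy.deepcopy(encoder).eval()     # frozen pre-trained reference copy
for p in pretrained.parameters():
    p.requires_grad_(False)

classifier = nn.Linear(d_hid, n_classes)       # task head
projection = nn.Linear(d_hid, d_hid)           # learnable projection of reference reps

params = list(encoder.parameters()) + list(classifier.parameters()) + list(projection.parameters())
optimizer = torch.optim.AdamW(params, lr=1e-3)

# Dummy batch standing in for task data.
x = torch.randn(16, d_in)
y = torch.randint(0, n_classes, (16,))

for step in range(5):
    optimizer.zero_grad()
    h_ft = encoder(x)                          # fine-tuned representations
    with torch.no_grad():
        h_pre = pretrained(x)                  # frozen pre-trained representations
    task_loss = F.cross_entropy(classifier(h_ft), y)
    # Anchor term: keep fine-tuned reps close to a projection of pre-trained reps.
    anchor_loss = F.mse_loss(h_ft, projection(h_pre))
    loss = task_loss + lam * anchor_loss
    loss.backward()
    optimizer.step()
    print(f"step {step}: task={task_loss.item():.3f} anchor={anchor_loss.item():.3f}")
```

In this sketch, setting the projection to the identity reduces the anchor term to a plain L2 penalty on representation drift, while a richer projection class gives the fine-tuned representations more freedom as long as the pre-trained information remains recoverable through that projection.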
