Paper Title
Representation Projection Invariance Mitigates Representation Collapse
Paper Authors
Paper Abstract
Fine-tuning contextualized representations learned by pre-trained language models remains a prevalent practice in NLP. However, fine-tuning can lead to representation degradation (also known as representation collapse), which may result in instability, sub-optimal performance, and weak generalization. In this paper, we propose Representation Projection Invariance (REPINA), a novel regularization method that maintains the information content of representations and reduces representation collapse during fine-tuning by discouraging undesirable changes in the representations. We study the empirical behavior of the proposed regularization against five comparable baselines across 13 language understanding tasks (the GLUE benchmark and six additional datasets). When evaluating in-domain performance, REPINA consistently outperforms the other baselines on most tasks (10 out of 13). We also demonstrate its effectiveness in few-shot settings and its robustness to label perturbation. As a by-product, we extend previous studies of representation collapse and propose several metrics to quantify it. Our empirical findings show that our approach is significantly more effective at mitigating representation collapse.
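The core idea of "discouraging undesirable changes in the representations" can be sketched as an auxiliary penalty added to the fine-tuning loss. The sketch below is an illustration only, not the paper's exact formulation: the linear projection matrix `W`, the squared-Euclidean distance, and the mean reduction over the batch are all assumptions. The intuition is that the fine-tuned representations are encouraged to stay within a (learnable) projection of the pre-trained representation space; setting `W` to the identity recovers a plain distance penalty to the pre-trained representations.

```python
import numpy as np

def projection_invariance_penalty(h_pre, h_ft, W):
    """Hedged sketch of a projection-invariance regularizer.

    h_pre : (batch, dim) pre-trained (frozen) representations
    h_ft  : (batch, dim) current fine-tuned representations
    W     : (dim, dim) projection matrix (hypothetical; a learnable
            parameter in a real implementation, identity as a special case)

    Returns the mean squared Euclidean distance between the fine-tuned
    representations and the projected pre-trained representations.
    """
    diff = h_ft - h_pre @ W
    return float(np.mean(np.sum(diff ** 2, axis=1)))

# Example: identical representations under an identity projection
# incur zero penalty; drifted representations are penalized.
h_pre = np.ones((2, 3))
W = np.eye(3)
print(projection_invariance_penalty(h_pre, h_pre, W))        # 0.0
print(projection_invariance_penalty(h_pre, h_pre + 1.0, W))  # 3.0
```

In training, this penalty would be weighted by a hyperparameter and added to the task loss, so the model trades off task performance against representation drift.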