Doge Tickets: Uncovering Domain-general Language Models by Playing Lottery Tickets

Paper title

Doge Tickets: Uncovering Domain-general Language Models by Playing Lottery Tickets

Authors

Yi Yang, Chen Zhang, Benyou Wang, Dawei Song

Abstract

Over-parameterized models, typically pretrained language models (LMs), have shown appealing expressive power due to their small learning bias. However, the huge learning capacity of LMs can also lead to large learning variance. In a pilot study, we find that, when faced with multiple domains, a critical portion of parameters behave unexpectedly in a domain-specific manner while others behave in a domain-general one. Motivated by this phenomenon, we posit for the first time that domain-general parameters can underpin a domain-general LM that can be derived from the original LM. To uncover the domain-general LM, we propose to identify domain-general parameters by playing lottery tickets (dubbed doge tickets). To intervene in the lottery, we propose a domain-general score, which depicts how domain-invariant a parameter is by associating it with the variance. Comprehensive experiments are conducted on the Amazon, MNLI and OntoNotes datasets. The results show that doge tickets obtain improved out-of-domain generalization in comparison with a range of competitive baselines. Analysis results further hint at the existence of domain-general parameters and the performance consistency of doge tickets.
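
The abstract does not spell out the domain-general score, so the following is only an illustrative sketch of the idea, not the authors' formula. It assumes each parameter already has a per-domain importance value and scores it as the mean importance across domains minus a variance penalty. The function names, the penalty weight `lam`, and the toy importance values are all hypothetical.

```python
from statistics import mean, pvariance


def domain_general_scores(importance_by_domain, lam=1.0):
    """Score each parameter as mean importance minus lam * variance across domains.

    importance_by_domain maps a domain name to a list of per-parameter
    importance values (hypothetical inputs; the abstract does not define them).
    """
    domains = list(importance_by_domain)
    n_params = len(importance_by_domain[domains[0]])
    scores = []
    for i in range(n_params):
        per_domain = [importance_by_domain[d][i] for d in domains]
        scores.append(mean(per_domain) - lam * pvariance(per_domain))
    return scores


def select_ticket(scores, keep_ratio):
    """Return a 0/1 mask that keeps the top keep_ratio fraction of parameters."""
    k = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]


# Toy importance values for three parameters over three made-up domains.
importance = {
    "domain_a": [0.9, 1.0, 0.3],
    "domain_b": [0.9, 0.0, 0.3],
    "domain_c": [0.9, 0.2, 0.3],
}

# Ranking by mean importance alone (lam=0) versus with the variance penalty.
mask_mean_only = select_ticket(domain_general_scores(importance, lam=0.0), 0.67)
mask_penalized = select_ticket(domain_general_scores(importance, lam=1.0), 0.67)
```

In this toy setup the second parameter has a higher mean importance than the third but varies strongly across domains, so the variance penalty swaps which of the two survives pruning. This mirrors the abstract's claim that domain-invariant parameters should be preferred.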
