Paper Title

G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks

Paper Authors

Zhongwei Wan, Yichun Yin, Wei Zhang, Jiaxin Shi, Lifeng Shang, Guangyong Chen, Xin Jiang, Qun Liu

Paper Abstract

Recently, domain-specific PLMs have been proposed to boost task performance in specific domains (e.g., biomedicine and computer science) by continuing to pre-train general PLMs on domain-specific corpora. However, this Domain-Adaptive Pre-Training (DAPT; Gururangan et al. (2020)) tends to forget the general knowledge previously acquired by the general PLM, leading to catastrophic forgetting and sub-optimal performance. To alleviate this problem, we propose a new framework, the General Memory-Augmented Pre-trained Language Model (G-MAP), which augments the domain-specific PLM with a memory representation built from a frozen general PLM, so that no general knowledge is lost. Specifically, we propose a new memory-augmented layer and, based on it, explore different augmentation strategies to build the memory representation, which is then adaptively fused into the domain-specific PLM. We demonstrate the effectiveness of G-MAP across various domains (biomedical and computer science publications, news, and reviews) and task types (text classification, QA, NER), and extensive results show that the proposed G-MAP achieves SOTA results on all tasks.
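To make the architecture concrete, below is a minimal PyTorch sketch of a memory-augmented layer in the spirit the abstract describes: the domain-specific PLM's hidden states attend over "memory" hidden states from a frozen general PLM, and a learned gate adaptively fuses the result back in. The class name, the cross-attention-plus-gate design, and all dimensions are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class MemoryAugmentedLayer(nn.Module):
    """Sketch of a G-MAP-style memory-augmented layer (illustrative, not the
    authors' exact design). Domain hidden states query memory hidden states
    from a frozen general PLM; a learned gate controls the fusion."""

    def __init__(self, hidden_size: int, num_heads: int = 8):
        super().__init__()
        # Cross-attention: domain states are queries; memory states are keys/values.
        self.cross_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        # Per-position scalar gate computed from both representations.
        self.gate = nn.Linear(2 * hidden_size, 1)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, domain_states: torch.Tensor, memory_states: torch.Tensor) -> torch.Tensor:
        # domain_states: (batch, seq_len, hidden) from the domain-specific PLM.
        # memory_states: (batch, mem_len, hidden) from the frozen general PLM.
        attended, _ = self.cross_attn(domain_states, memory_states, memory_states)
        g = torch.sigmoid(self.gate(torch.cat([domain_states, attended], dim=-1)))
        # Adaptive fusion: gate how much general-PLM memory flows into the domain stream.
        return self.norm(domain_states + g * attended)

# Usage sketch: the general PLM stays frozen, so its memory representation
# preserves general knowledge; only the fusion parameters are trained here.
layer = MemoryAugmentedLayer(hidden_size=768)
domain_h = torch.randn(2, 128, 768)      # stand-in for domain-PLM hidden states
with torch.no_grad():
    memory_h = torch.randn(2, 128, 768)  # stand-in for frozen general-PLM outputs
fused = layer(domain_h, memory_h)        # (2, 128, 768)
```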
