Paper Title

The Dark Side of the Language: Pre-trained Transformers in the DarkNet

Authors

Leonardo Ranaldi, Aria Nourbakhsh, Arianna Patrizi, Elena Sofia Ruzzetti, Dario Onorati, Francesca Fallucchi, Fabio Massimo Zanzotto

Abstract

Pre-trained Transformers are challenging human performance in many NLP tasks. The massive datasets used for pre-training seem to be the key to their success on existing tasks. In this paper, we explore how a range of pre-trained Natural Language Understanding models perform on definitely unseen sentences provided by classification tasks over a DarkNet corpus. Surprisingly, results show that syntactic and lexical neural networks perform on par with pre-trained Transformers even after fine-tuning. Only after what we call extreme domain adaptation, that is, retraining with the masked language model task on the whole novel corpus, do pre-trained Transformers reach their usual high results. This suggests that huge pre-training corpora may give Transformers unexpected help, since they are exposed to many of the possible sentences.
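The "extreme domain adaptation" step described in the abstract, that is, continuing masked-language-model pretraining on the in-domain corpus before fine-tuning, can be sketched with the Hugging Face transformers and datasets libraries. This is a minimal illustration under assumed settings: the bert-base-uncased checkpoint, the file darknet_corpus.txt, and all hyperparameters are placeholders rather than the authors' actual configuration.

```python
# Minimal sketch of domain-adaptive MLM pretraining on an in-domain corpus,
# followed by (optional) fine-tuning for classification from the adapted checkpoint.
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

model_name = "bert-base-uncased"  # any pre-trained Transformer checkpoint (assumption)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical plain-text file with one in-domain (DarkNet) sentence per line.
dataset = load_dataset("text", data_files={"train": "darknet_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# The collator randomly masks tokens so the model is retrained with the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="mlm-darknet",          # adapted checkpoint is saved here
    num_train_epochs=3,                # illustrative values, not the paper's setup
    per_device_train_batch_size=16,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()

# The adapted checkpoint in "mlm-darknet" can then be loaded with
# AutoModelForSequenceClassification and fine-tuned on the classification task.
```

The design point is simply that adaptation and fine-tuning are two separate passes: the MLM pass exposes the model to the previously unseen domain, and only afterwards is the classification head trained.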
