Paper Title

Meta-learning for Few-shot Natural Language Processing: A Survey

Authors

Yin, Wenpeng

Abstract

Few-shot natural language processing (NLP) refers to NLP tasks that come with only a handful of labeled examples. This is a real-world challenge that an AI system must learn to handle. Usually we rely on collecting more auxiliary information or developing a more efficient learning algorithm. However, general gradient-based optimization in high-capacity models, when training from scratch, requires many parameter-update steps over a large number of labeled examples to perform well (Snell et al., 2017). If the target task itself cannot provide more information, why not collect more tasks equipped with rich annotations to help the model learn? The goal of meta-learning is to train a model on a variety of richly annotated tasks, such that it can solve a new task using only a few labeled samples. The key idea is to train the model's initial parameters such that the model achieves maximal performance on a new task after the parameters have been updated through zero or a couple of gradient steps. There are already several surveys of meta-learning, such as Vilalta and Drissi (2002), Vanschoren (2018), and Hospedales et al. (2020). This paper, however, focuses on the NLP domain, especially few-shot applications. We try to provide clearer definitions, a summary of progress, and some common datasets for applying meta-learning to few-shot NLP.
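
The "key idea" sentence in the abstract describes optimization-based meta-learning in the style of MAML (Finn et al., 2017): an outer loop updates a shared initialization so that an inner loop of a few gradient steps on a new task's small support set yields good performance on that task's query set. Below is a minimal PyTorch sketch of that idea; the toy linear-regression task generator, the model, and all hyperparameters are illustrative assumptions, not details from the survey.

```python
# Minimal MAML-style sketch: learn an initialization that adapts to a new
# task after a single gradient step on a small support set. All task and
# hyperparameter choices here are hypothetical, for illustration only.
import torch

torch.manual_seed(0)

def sample_task(n=10, dim=5):
    """A hypothetical 'task': linear regression with task-specific weights."""
    w = torch.randn(dim, 1)
    x_support, x_query = torch.randn(n, dim), torch.randn(n, dim)
    return (x_support, x_support @ w), (x_query, x_query @ w)

def forward(params, x):
    weight, bias = params
    return x @ weight + bias

def loss_fn(params, batch):
    x, y = batch
    return ((forward(params, x) - y) ** 2).mean()

# Meta-parameters: the shared initialization being meta-learned.
meta_params = [torch.zeros(5, 1, requires_grad=True),
               torch.zeros(1, requires_grad=True)]
meta_opt = torch.optim.Adam(meta_params, lr=1e-2)
inner_lr = 0.1

for step in range(2000):
    meta_opt.zero_grad()
    for _ in range(4):  # a meta-batch of tasks
        support, query = sample_task()
        # Inner loop: one gradient step on the support set, keeping the
        # graph so the meta-gradient flows through the adaptation.
        grads = torch.autograd.grad(loss_fn(meta_params, support),
                                    meta_params, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(meta_params, grads)]
        # Outer loop: evaluate the adapted parameters on the query set.
        loss_fn(adapted, query).backward()
    meta_opt.step()

# After meta-training, a new task is solved with one step on its support set.
support, query = sample_task()
grads = torch.autograd.grad(loss_fn(meta_params, support), meta_params)
adapted = [p - inner_lr * g for p, g in zip(meta_params, grads)]
print(f"query loss after one adaptation step: {loss_fn(adapted, query).item():.4f}")
```

In few-shot NLP the same template applies, with the toy regression replaced by episodes of text-classification tasks (e.g., N-way K-shot label sets) and the linear model replaced by a text encoder.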
