Paper Title

Selective Annotation Makes Language Models Better Few-Shot Learners

Paper Authors

Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu

Paper Abstract

Many recent approaches to natural language tasks are built on the remarkable abilities of large language models. Large language models can perform in-context learning, where they learn a new task from a few task demonstrations, without any parameter updates. This work examines the implications of in-context learning for the creation of datasets for new natural language tasks. Departing from recent in-context learning methods, we formulate an annotation-efficient, two-step framework: selective annotation that chooses a pool of examples to annotate from unlabeled data in advance, followed by prompt retrieval that retrieves task examples from the annotated pool at test time. Based on this framework, we propose an unsupervised, graph-based selective annotation method, vote-k, to select diverse, representative examples to annotate. Extensive experiments on 10 datasets (covering classification, commonsense reasoning, dialogue, and text/code generation) demonstrate that our selective annotation method improves the task performance by a large margin. On average, vote-k achieves a 12.9%/11.4% relative gain under an annotation budget of 18/100, as compared to randomly selecting examples to annotate. Compared to state-of-the-art supervised finetuning approaches, it yields similar performance with 10-100x less annotation cost across 10 tasks. We further analyze the effectiveness of our framework in various scenarios: language models with varying sizes, alternative selective annotation methods, and cases where there is a test data domain shift. We hope that our studies will serve as a basis for data annotations as large language models are increasingly applied to new tasks. Our code is available at https://github.com/HKUNLP/icl-selective-annotation.
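
To make the two-step framework concrete, below is a minimal Python sketch: it builds a k-nearest-neighbor graph over sentence embeddings, runs a greedy discounted-vote selection in the spirit of the graph-based first stage of vote-k (the paper's confidence-based second stage is omitted), and then retrieves the most similar annotated examples as in-context demonstrations at test time. The embedding model, function names, and hyperparameters (k, rho) are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# A hedged sketch of selective annotation + prompt retrieval, assuming
# sentence-transformers embeddings. Not the official vote-k code.
import numpy as np
from sentence_transformers import SentenceTransformer


def knn_graph(emb: np.ndarray, k: int) -> list[set[int]]:
    """For each example, return the indices of its k nearest neighbors
    under cosine similarity (embeddings are L2-normalized)."""
    sim = emb @ emb.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-neighbors
    return [set(row) for row in np.argsort(-sim, axis=1)[:, :k]]


def select_vote_k(emb: np.ndarray, budget: int, k: int = 10, rho: float = 10.0) -> list[int]:
    """Greedy graph-based selective annotation (first stage of vote-k):
    each candidate is scored by votes from its unselected neighbors, and a
    neighbor's vote is exponentially down-weighted by how many already
    selected examples it is close to, pushing the selection toward
    diverse, representative examples."""
    n = emb.shape[0]
    assert budget <= n
    nbrs = knn_graph(emb, k)
    selected: list[int] = []
    overlap = np.zeros(n)  # overlap[v]: #selected examples having v as a neighbor
    while len(selected) < budget:
        scores = np.full(n, -np.inf)
        for u in range(n):
            if u in selected:
                continue
            scores[u] = sum(rho ** (-overlap[v]) for v in nbrs[u] if v not in selected)
        u = int(np.argmax(scores))
        selected.append(u)
        for v in nbrs[u]:
            overlap[v] += 1
    return selected


def retrieve_prompts(emb_pool: np.ndarray, emb_test: np.ndarray, m: int) -> np.ndarray:
    """Prompt retrieval: for each test example, pick the m most similar
    annotated examples to place in the in-context prompt."""
    sim = emb_test @ emb_pool.T
    return np.argsort(-sim, axis=1)[:, :m]


if __name__ == "__main__":
    # Toy run; in practice the pool is the full unlabeled training set.
    model = SentenceTransformer("all-mpnet-base-v2")
    unlabeled = [
        "The movie was fantastic.",
        "I hated every minute of it.",
        "What is the capital of France?",
        "Paris is the capital of France.",
    ]
    emb = model.encode(unlabeled, normalize_embeddings=True)
    pool = select_vote_k(emb, budget=2, k=2)  # annotate only these indices
    test = model.encode(["A truly wonderful film."], normalize_embeddings=True)
    print(retrieve_prompts(emb[pool], test, m=1))  # nearest annotated demo
```

The greedy discounting is the key design choice: without the rho^(-overlap) factor the selection collapses onto the densest cluster, whereas down-weighting votes near already-selected examples spreads the annotation budget across the data distribution.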
