Paper Title

ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization

Paper Authors

Hanwei Xu, Yujun Chen, Yulun Du, Nan Shao, Yanggang Wang, Haiyu Li, Zhilin Yang

Paper Abstract

We propose a multitask pretraining approach ZeroPrompt for zero-shot generalization, focusing on task scaling and zero-shot prompting. While previous models are trained on only a few dozen tasks, we scale to 1,000 tasks for the first time using real-world data. This leads to a crucial discovery that task scaling can be an efficient alternative to model scaling; i.e., the model size has little impact on performance with an extremely large number of tasks. Our results show that task scaling can substantially improve training efficiency by 30 times in FLOPs. Moreover, we present a prompting method that incorporates a genetic algorithm to automatically search for the best prompt for unseen tasks, along with a few other improvements. Empirically, ZeroPrompt substantially improves both the efficiency and the performance of zero-shot learning across a variety of academic and production datasets.
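The abstract mentions a genetic algorithm that automatically searches for the best prompt for an unseen task. Below is a minimal, hypothetical sketch of such a search over prompt templates, assuming a user-supplied `score_fn`; the seed templates, toy vocabulary, and mutation/crossover operators here are illustrative assumptions, not the paper's actual procedure.

```python
import random

def genetic_prompt_search(seed_templates, vocab, score_fn,
                          generations=10, population=20, keep=5):
    """Evolve prompt templates for an unseen task via a simple genetic algorithm."""

    def mutate(template):
        # Replace one random token of the template with a candidate token.
        tokens = template.split()
        tokens[random.randrange(len(tokens))] = random.choice(vocab)
        return " ".join(tokens)

    def crossover(a, b):
        # Splice the front of one parent template onto the back of the other.
        ta, tb = a.split(), b.split()
        cut = random.randint(1, min(len(ta), len(tb)))
        return " ".join(ta[:cut] + tb[cut:])

    pool = list(seed_templates)
    for _ in range(generations):
        # Elitism: keep the templates the scorer rates highest.
        pool.sort(key=score_fn, reverse=True)
        parents = pool[:keep]
        children = []
        while len(parents) + len(children) < population:
            a, b = random.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        pool = parents + children
    return max(pool, key=score_fn)

# Toy usage with a dummy scorer; a real scorer would fill each template with
# the task inputs, run the pretrained multitask model, and measure zero-shot
# accuracy on a small development slice of the target task.
seeds = ["Question: {x} Answer:", "Given {x} , what is the sentiment ?"]
vocab = ["Question:", "Review:", "sentiment", "label", "Answer:", "{x}", "what", "is", "the", "?"]
best = genetic_prompt_search(seeds, vocab, score_fn=lambda t: -abs(len(t.split()) - 6))
print(best)
```

The dummy scorer simply prefers templates of about six tokens; it stands in for the model-based zero-shot evaluation the paper describes.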
