Title
SkillNet-NLG: General-Purpose Natural Language Generation with a Sparsely Activated Approach
Authors
Abstract
We present SkillNet-NLG, a sparsely activated approach that handles many natural language generation tasks with one model. Unlike traditional dense models, which always activate all of their parameters, SkillNet-NLG selectively activates the relevant parts of its parameters to accomplish a task, where the relevance is controlled by a set of predefined skills. The strength of such a model design is that it provides an opportunity to precisely adapt the relevant skills to learn new tasks effectively. We evaluate on Chinese natural language generation tasks. Results show that, with only one model file, SkillNet-NLG outperforms the previous best-performing methods on four of five tasks. SkillNet-NLG performs better than two multi-task learning baselines (a dense model and a Mixture-of-Experts model) and achieves performance comparable to task-specific models. Lastly, SkillNet-NLG surpasses the baseline systems when adapted to new tasks.
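The skill-controlled sparse activation described in the abstract can be sketched as follows. This is a minimal illustration under assumed details, not the authors' implementation: the `SkillSparseFFN` module, the skill names, and the task-to-skill mapping below are hypothetical, and the paper defines its own skill inventory inside a pre-trained Transformer. The core idea it demonstrates is that each feed-forward block holds one expert per predefined skill, and only the experts whose skills are relevant to the current task are run and averaged.

```python
import torch
import torch.nn as nn

class SkillSparseFFN(nn.Module):
    """A Transformer feed-forward block with one expert per predefined
    skill; only task-relevant experts are activated (sketch)."""

    def __init__(self, d_model: int, d_ff: int, num_skills: int):
        super().__init__()
        # One feed-forward expert per predefined skill.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_ff),
                nn.GELU(),
                nn.Linear(d_ff, d_model),
            )
            for _ in range(num_skills)
        )

    def forward(self, hidden: torch.Tensor, active_skills: list[int]) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model); active_skills lists the
        # indices of the skills predefined as relevant to the task.
        outputs = [self.experts[i](hidden) for i in active_skills]
        # Average the activated experts; inactive experts perform no
        # computation and receive no gradient.
        return torch.stack(outputs, dim=0).mean(dim=0)

# Hypothetical skill inventory and task-to-skill mapping: e.g. a
# dialogue task might activate a general skill plus open-ended
# generation and conversation skills, while a data-to-text task
# activates a different subset.
SKILLS = ["general", "open_ended", "conversation", "data_to_text"]
TASK_SKILLS = {"dialogue": [0, 1, 2], "table_to_text": [0, 3]}

layer = SkillSparseFFN(d_model=768, d_ff=3072, num_skills=len(SKILLS))
x = torch.randn(2, 16, 768)
y = layer(x, TASK_SKILLS["dialogue"])  # activates three of four experts
```

Because the task-to-skill mapping is fixed rather than learned, adapting to a new task amounts to choosing which existing experts to activate (and fine-tuning them), which is the mechanism the abstract credits for effective transfer to new tasks.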