Paper Title

Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing

Authors

Xilun Chen, Asish Ghoshal, Yashar Mehdad, Luke Zettlemoyer, Sonal Gupta

Abstract

Task-oriented semantic parsing is a critical component of virtual assistants, which is responsible for understanding the user's intents (set reminder, play music, etc.). Recent advances in deep learning have enabled several approaches to successfully parse more complex queries (Gupta et al., 2018; Rongali et al., 2020), but these models require a large amount of annotated training data to parse queries on new domains (e.g. reminder, music). In this paper, we focus on adapting task-oriented semantic parsers to low-resource domains, and propose a novel method that outperforms a supervised neural model at a 10-fold data reduction. In particular, we identify two fundamental factors for low-resource domain adaptation: better representation learning and better training techniques. Our representation learning uses BART (Lewis et al., 2019) to initialize our model, which outperforms encoder-only pre-trained representations used in previous work. Furthermore, we train with optimization-based meta-learning (Finn et al., 2017) to improve generalization to low-resource domains. This approach significantly outperforms all baseline methods in the experiments on a newly collected multi-domain task-oriented semantic parsing dataset (TOPv2), which we release to the public.
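The abstract names two ingredients: a seq2seq parser initialized from BART, and MAML-style optimization-based meta-learning over source domains. Below is a minimal sketch of how the two could fit together; it is not the authors' released code. The checkpoint name, the toy TOP-style linearized parses, the hyperparameters, and the first-order MAML simplification are all illustrative assumptions.

```python
# Hedged sketch: BART-initialized seq2seq parser + first-order MAML
# over source domains. Not the paper's implementation; all names and
# data below are illustrative.
import copy
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-5)

def seq2seq_loss(m, utterances, parses):
    """Cross-entropy of linearized parse trees given user utterances."""
    x = tok(utterances, return_tensors="pt", padding=True)
    y = tok(parses, return_tensors="pt", padding=True).input_ids
    return m(**x, labels=y).loss

# Each meta-training "task" is one source domain, given as (support, query)
# batches of (utterance, TOP-style linearized parse) pairs. Toy example:
tasks = [
    ((["set an alarm for 7 am"],
      ["[IN:CREATE_ALARM [SL:DATE_TIME for 7 am ] ]"]),
     (["wake me up at noon"],
      ["[IN:CREATE_ALARM [SL:DATE_TIME at noon ] ]"])),
]

for support, query in tasks:
    fast = copy.deepcopy(model)                      # inner-loop clone
    inner_opt = torch.optim.SGD(fast.parameters(), lr=1e-4)
    loss = seq2seq_loss(fast, *support)              # adapt on support set
    inner_opt.zero_grad()
    loss.backward()
    inner_opt.step()
    seq2seq_loss(fast, *query).backward()            # evaluate adapted params
    # First-order approximation: treat the adapted model's gradients as the
    # meta-gradient and accumulate them onto the original parameters.
    for p, fp in zip(model.parameters(), fast.parameters()):
        p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
meta_opt.step()                                      # one meta-update
```

At adaptation time, the meta-trained model would simply be fine-tuned on the small annotated set of the new target domain with the same `seq2seq_loss`, which is where the BART initialization and meta-learned starting point are meant to pay off.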
