Paper title
An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks
Paper authors
Abstract
Many tasks in natural language processing involve predicting structured outputs, e.g., sequence labeling, semantic role labeling, parsing, and machine translation. Researchers are increasingly applying deep representation learning to these problems, but the structured component of these approaches is usually quite simplistic. In this work, we propose several high-order energy terms to capture complex dependencies among labels in sequence labeling, including several that consider the entire label sequence. We use neural parameterizations for these energy terms, drawing from convolutional, recurrent, and self-attention networks. We use the framework of learning energy-based inference networks (Tu and Gimpel, 2018) to deal with the difficulties of training and inference with such models. We empirically demonstrate that this approach achieves substantial improvement using a variety of high-order energy terms on four sequence labeling tasks, while having the same decoding speed as simple, local classifiers. We also find that high-order energies help in noisy data conditions.