Paper Title
Salience Estimation with Multi-Attention Learning for Abstractive Text Summarization
Paper Authors
Paper Abstract
The attention mechanism plays a dominant role in sequence generation models and has been used to improve the performance of machine translation and abstractive text summarization. Unlike neural machine translation, the task of text summarization treats salience estimation for words, phrases, or sentences as a critical component, since the output summary is a distillation of the input text. Although the typical attention mechanism can select text fragments from the input conditioned on the decoder states, there is still a gap in conducting direct and effective salience detection. To bring direct salience estimation back to neural summarization, we propose a Multi-Attention Learning framework that contains two new attention learning components for salience estimation: supervised attention learning and unsupervised attention learning. We regard the attention weights as salience information, meaning that semantic units with larger attention values are more important. The context information obtained from the estimated salience is merged with the typical attention mechanism in the decoder to guide summary generation. Extensive experiments on benchmark datasets in different languages demonstrate the effectiveness of the proposed framework for the task of abstractive summarization.
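To make the merging step concrete, below is a minimal sketch (not the authors' released code) of the idea described in the abstract: attention weights over the encoder states serve as salience scores, and the salience-weighted context vector is merged with the usual decoder-side attention context before generating each summary token. All module and variable names here (e.g. SalienceAwareDecoderStep, salience) are illustrative assumptions, and the salience vector is assumed to come from the separately trained supervised/unsupervised attention components.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SalienceAwareDecoderStep(nn.Module):
    """One decoder step that merges a salience-weighted context
    with the typical attention context (hypothetical sketch)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Bilinear scoring for the typical decoder attention.
        self.attn_score = nn.Linear(hidden_size, hidden_size, bias=False)
        # Projection that merges decoder state + both context vectors.
        self.merge = nn.Linear(3 * hidden_size, hidden_size)

    def forward(self, decoder_state, encoder_states, salience):
        # decoder_state:  (batch, hidden)
        # encoder_states: (batch, src_len, hidden)
        # salience:       (batch, src_len), a distribution over source units
        #                 produced by the salience-estimation attention.
        scores = torch.bmm(
            encoder_states,
            self.attn_score(decoder_state).unsqueeze(2)).squeeze(2)
        attn = F.softmax(scores, dim=1)  # typical decoder attention
        attn_ctx = torch.bmm(attn.unsqueeze(1), encoder_states).squeeze(1)
        # Salience-weighted context over the same encoder states.
        sal_ctx = torch.bmm(salience.unsqueeze(1), encoder_states).squeeze(1)
        # Merge both contexts with the decoder state for token prediction.
        return torch.tanh(self.merge(
            torch.cat([decoder_state, attn_ctx, sal_ctx], dim=1)))
```

Under these assumptions, a caller would pass the current decoder hidden state, the encoder outputs, and the estimated salience distribution at each step; the returned vector would then feed the output projection over the vocabulary.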