Paper Title
Lexicon-constrained Copying Network for Chinese Abstractive Summarization
Paper Authors
Paper Abstract
The copy mechanism allows sequence-to-sequence models to choose words from the input and put them directly into the output, and it is finding increasing use in abstractive summarization. However, since there is no explicit delimiter in Chinese sentences, most existing models for Chinese abstractive summarization can only perform character-level copying, resulting in inefficiency. To solve this problem, we propose a lexicon-constrained copying network that models multi-granularity in both the encoder and the decoder. On the source side, words and characters are aggregated into the same input memory using a Transformer-based encoder. On the target side, the decoder can copy either a character or a multi-character word at each time step, and the decoding process is guided by a word-enhanced search algorithm that facilitates parallel computation and encourages the model to copy more words. Moreover, we adopt a word selector to integrate keyword information. Experimental results on a Chinese social media dataset show that our model can work standalone or with the word selector. Both forms outperform previous character-based models and achieve competitive performance.
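To make the copy mechanism concrete, here is a minimal sketch of how a decoder step can mix a generation distribution over the vocabulary with a copy distribution over source units (characters or multi-character words), in the pointer-generator style. The function name, shapes, and the scalar gate `p_gen` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def copy_mix(vocab_logits, attn_scores, src_ids, p_gen):
    """Mix generation and copy distributions at one decoder step.

    vocab_logits: scores over the fixed output vocabulary
    attn_scores:  attention scores over source positions
    src_ids:      vocabulary id of each source unit (a character
                  or a multi-character word in the lexicon)
    p_gen:        probability of generating rather than copying
    """
    gen_dist = p_gen * softmax(vocab_logits)
    copy_dist = (1.0 - p_gen) * softmax(attn_scores)
    # Scatter-add copy probability onto the ids present in the source,
    # so any source unit can be copied even if it is rare in training.
    out = gen_dist.copy()
    np.add.at(out, src_ids, copy_dist)
    return out
```

A source id that appears at several positions accumulates copy mass from each of them, which is what biases the model toward units actually present in the input.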