Paper title
Summarize, Outline, and Elaborate: Long-Text Generation via Hierarchical Supervision from Extractive Summaries
Paper authors
Paper abstract
Generating coherent long texts is difficult because existing models overwhelmingly focus on predicting local words: they cannot make high-level plans about what to generate, nor capture the high-level discourse dependencies between chunks of text. Inspired by human writing processes, where a list of bullet points or a catalog is first outlined and each bullet point is then expanded to form the whole article, we propose SOE, a pipelined system that involves summarizing, outlining, and elaborating for long-text generation: the model first outlines the summaries for different segments of a long text, and then elaborates on each bullet point to generate the corresponding segment. To avoid the labor-intensive process of soliciting summaries, we propose the reconstruction strategy, which extracts segment summaries in an unsupervised manner by selecting the part of each segment that is most informative for reconstructing that segment. The proposed generation system has the following merits: (1) the summary provides high-level guidance for text generation and avoids the local minimum of individual word predictions; (2) the high-level discourse dependencies are captured in the conditional dependencies between summaries and are preserved during the summary expansion process; and (3) we are able to consider significantly more context by representing contexts as concise summaries. Extensive experiments demonstrate that SOE produces long texts of significantly better quality, along with faster convergence.
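The reconstruction idea in the abstract (pick the part of a segment that best helps reconstruct the whole segment, then use the picked parts as an outline) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring model, so the sketch swaps in a simple token-coverage proxy, and every name in it (`tokenize`, `split_sentences`, `reconstruction_score`, `extract_summary`, `build_outline`) is hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately naive tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def split_sentences(segment):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", segment)
    return [p.strip() for p in parts if p.strip()]


def reconstruction_score(sentence, segment_counts):
    """ASSUMPTION: stand-in for a learned reconstruction score.

    Returns the fraction of the segment's token occurrences whose word
    also appears in the candidate sentence.
    """
    total = sum(segment_counts.values())
    if total == 0:
        return 0.0
    vocab = set(tokenize(sentence))
    covered = sum(n for word, n in segment_counts.items() if word in vocab)
    return covered / total


def extract_summary(segment):
    """Pick the sentence that best 'reconstructs' its own segment."""
    sentences = split_sentences(segment)
    if not sentences:
        return ""
    counts = Counter(tokenize(segment))
    return max(sentences, key=lambda s: reconstruction_score(s, counts))


def build_outline(segments):
    """One extracted summary per segment, in document order."""
    return [extract_summary(segment) for segment in segments]
```

In a full pipeline of the kind the abstract describes, an outline built this way would serve as unsupervised training targets for the outlining stage, and each summary would then condition the generation of its segment; those generation stages are left out here because the abstract gives no detail about them.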