Paper Title

Future Sight: Dynamic Story Generation with Large Pretrained Language Models

Paper Authors

Zimmerman, Brian D., Sahu, Gaurav, Vechtomova, Olga

Paper Abstract

Recent advances in deep learning research, such as transformers, have bolstered the ability of automated agents to generate creative texts similar to those that a human would write. By default, transformer decoders can only generate new text with respect to previously generated text. The output distribution of candidate tokens at any position is conditioned on previously selected tokens using a self-attention mechanism to emulate the property of autoregression. This is inherently limiting for tasks such as controllable story generation where it may be necessary to condition on future plot events when writing a story. In this work, we propose Future Sight, a method for finetuning a pretrained generative transformer on the task of future conditioning. Transformer decoders are typically pretrained on the task of completing a context, one token at a time, by means of self-attention. Future Sight additionally enables a decoder to attend to an encoded future plot event. This motivates the decoder to expand on the context in a way that logically concludes with the provided future. During inference, the future plot event can be written by a human author to steer the narrative being generated in a certain direction. We evaluate the efficacy of our approach on a story generation task with human evaluators.
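The abstract only names the mechanism (the decoder attends to an encoded future plot event), not the authors' implementation. The sketch below is a minimal illustration of that idea under stated assumptions: it reuses an off-the-shelf T5 encoder-decoder from the Hugging Face `transformers` library, feeds a hypothetical future plot event to the encoder, and lets the decoder continue a story context while cross-attending to that encoding. The model name (`t5-small`), example strings, and decoding settings are illustrative choices, not details from the paper.

```python
# A minimal sketch (not the authors' released code) of future-conditioned
# decoding with an off-the-shelf encoder-decoder model. Assumption: the
# future plot event is given to the encoder, and the decoder extends the
# story context while cross-attending to that encoded future.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Hypothetical inputs: the future event the story should conclude with,
# and the story written so far.
future_event = "The knight finally discovers the dragon was her sister."
context = "The knight rode north through the frozen pass."

# Encoder side: the future plot event.
enc = tokenizer(future_event, return_tensors="pt")

# Decoder side: T5 generation starts from the decoder start token,
# followed here by the story context so far.
ctx_ids = tokenizer(context, return_tensors="pt",
                    add_special_tokens=False).input_ids
start = torch.full((1, 1), model.config.decoder_start_token_id)
decoder_input_ids = torch.cat([start, ctx_ids], dim=-1)

# Each new token is conditioned on the prior tokens (self-attention)
# and on the encoded future event (cross-attention).
out = model.generate(
    input_ids=enc.input_ids,
    attention_mask=enc.attention_mask,
    decoder_input_ids=decoder_input_ids,
    max_new_tokens=60,
    do_sample=True,
    top_p=0.9,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

In this wiring, a human author can steer the narrative at inference time simply by rewriting `future_event`; a pretrained checkpoint would of course need the paper's finetuning on future-conditioning data before the cross-attention reliably pulls the story toward the provided conclusion.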
