Paragraph-level Commonsense Transformers with Recurrent Memory

Paper title

Paragraph-level Commonsense Transformers with Recurrent Memory

Authors

Saadia Gabriel, Chandra Bhagavatula, Vered Shwartz, Ronan Le Bras, Maxwell Forbes, Yejin Choi

Abstract

Human understanding of narrative texts requires making commonsense inferences beyond what is stated explicitly in the text. A recent model, COMET, can generate such implicit commonsense inferences along several dimensions such as pre- and post-conditions, motivations, and mental states of the participants. However, COMET was trained on commonsense inferences of short phrases, and is therefore discourse-agnostic. When presented with each sentence of a multi-sentence narrative, it might generate inferences that are inconsistent with the rest of the narrative. We present the task of discourse-aware commonsense inference. Given a sentence within a narrative, the goal is to generate commonsense inferences along predefined dimensions, while maintaining coherence with the rest of the narrative. Such large-scale paragraph-level annotation is hard to get and costly, so we use available sentence-level annotations to efficiently and automatically construct a distantly supervised corpus. Using this corpus, we train PARA-COMET, a discourse-aware model that incorporates paragraph-level information to generate coherent commonsense inferences from narratives. PARA-COMET captures both semantic knowledge pertaining to prior world knowledge, and episodic knowledge involving how current events relate to prior and future events in a narrative. Our results show that PARA-COMET outperforms the sentence-level baselines, particularly in generating inferences that are both coherent and novel.
