Improving Zero-Shot Event Extraction via Sentence Simplification

Paper Title

Improving Zero-Shot Event Extraction via Sentence Simplification

Authors

Sneha Mehta, Huzefa Rangwala, Naren Ramakrishnan

Abstract

The success of sites such as ACLED and Our World in Data has demonstrated the massive utility of extracting events in structured formats from large volumes of textual data in the form of news, social media, blogs and discussion forums. Event extraction can provide a window into ongoing geopolitical crises and yield actionable intelligence. With the proliferation of large pretrained language models, Machine Reading Comprehension (MRC) has recently emerged as a new paradigm for event extraction. In this approach, event argument extraction is framed as an extractive question-answering task. One of the key advantages of the MRC-based approach is its ability to perform zero-shot extraction. However, MRC-based approaches are plagued by the problem of long-range dependencies, i.e., large lexical distance between trigger and argument words, and by the difficulty of processing syntactically complex sentences. In this paper, we present a general approach to improve the performance of MRC-based event extraction by performing unsupervised sentence simplification guided by the MRC model itself. We evaluate our approach on the ICEWS geopolitical event extraction dataset, with specific attention to the 'Actor' and 'Target' argument roles. We show how such context simplification can improve the performance of MRC-based event extraction by more than 5% for actor extraction and more than 10% for target extraction.
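
The abstract does not give implementation details, so the following is only a minimal Python sketch of the general idea: argument extraction is posed as a question over a context sentence, and among several candidate simplifications the one the reader is most confident about is kept. The question templates, the function names, and the `toy_mrc` stand-in are all assumptions made for illustration; they are not the authors' templates, model, or selection rule.

```python
from typing import Callable, List, Tuple

# Hypothetical question templates; the abstract does not state the real wording.
QUESTION_TEMPLATES = {
    "Actor": "Who was the actor in the {trigger} event?",
    "Target": "Who was the target of the {trigger} event?",
}

# An MRC model is modelled as any callable mapping
# (question, context) to (answer_span, confidence).
MRCModel = Callable[[str, str], Tuple[str, float]]


def build_question(role: str, trigger: str) -> str:
    """Turn an argument role and an event trigger into a question."""
    return QUESTION_TEMPLATES[role].format(trigger=trigger)


def extract_argument(
    mrc: MRCModel, role: str, trigger: str, candidates: List[str]
) -> Tuple[str, float, str]:
    """Ask the same question over every candidate context and keep the
    answer with the highest confidence, together with its context."""
    question = build_question(role, trigger)
    best: Tuple[str, float, str] = ("", float("-inf"), "")
    for context in candidates:
        answer, confidence = mrc(question, context)
        if confidence > best[1]:
            best = (answer, confidence, context)
    return best


def toy_mrc(question: str, context: str) -> Tuple[str, float]:
    """Stand-in for a real reader, NOT an actual MRC model.

    It returns the first capitalised token for actor questions and the
    last one otherwise, and is more confident on shorter contexts."""
    tokens = [t.strip(".,") for t in context.split()]
    names = [t for t in tokens if t and t[0].isupper()]
    if not names:
        return "", 0.0
    answer = names[0] if "actor" in question else names[-1]
    return answer, 1.0 / len(tokens)


# The original sentence plus one hand-written simplification of it.
candidates = [
    "France, which had hosted the summit earlier that week, criticized Germany.",
    "France criticized Germany.",
]
actor, _, actor_context = extract_argument(toy_mrc, "Actor", "criticized", candidates)
target, _, _ = extract_argument(toy_mrc, "Target", "criticized", candidates)
```

In practice `toy_mrc` would be replaced by a pretrained extractive question-answering model, and the candidate contexts would come from an unsupervised simplification procedure rather than being written by hand.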
