Paper Title

On Event Individuation for Document-Level Information Extraction

Authors

William Gantt, Reno Kriz, Yunmo Chen, Siddharth Vashishtha, Aaron Steven White

Abstract

As information extraction (IE) systems have grown more adept at processing whole documents, the classic task of template filling has seen renewed interest as a benchmark for document-level IE. In this position paper, we call into question the suitability of template filling for this purpose. We argue that the task demands definitive answers to thorny questions of event individuation -- the problem of distinguishing distinct events -- about which even human experts disagree. Through an annotation study and error analysis, we show that this raises concerns about the usefulness of template filling metrics, the quality of datasets for the task, and the ability of models to learn it. Finally, we consider possible solutions.
