Paper Title
E-KAR: A Benchmark for Rationalizing Natural Language Analogical Reasoning
Authors
Abstract
The ability to recognize analogies is fundamental to human cognition. Existing benchmarks for testing word analogy do not reveal the underlying process of analogical reasoning in neural models. Holding the belief that models capable of reasoning should be right for the right reasons, we propose a first-of-its-kind Explainable Knowledge-intensive Analogical Reasoning benchmark (E-KAR). Our benchmark consists of 1,655 (in Chinese) and 1,251 (in English) problems sourced from the Civil Service Exams, which require intensive background knowledge to solve. More importantly, we design a free-text explanation scheme to explain whether an analogy should be drawn, and manually annotate these explanations for each and every question and candidate answer. Empirical results suggest that this benchmark is very challenging for some state-of-the-art models on both the explanation generation and analogical question answering tasks, which invites further research in this area.