A Survey on Recognizing Textual Entailment as an NLP Evaluation

Paper Title

A Survey on Recognizing Textual Entailment as an NLP Evaluation

Author

Poliak, Adam

Abstract

Recognizing Textual Entailment (RTE) was proposed as a unified evaluation framework to compare the semantic understanding of different NLP systems. In this survey paper, we provide an overview of different approaches for evaluating and understanding the reasoning capabilities of NLP systems. We then focus our discussion on RTE by highlighting prominent RTE datasets, as well as advances in RTE datasets that focus on specific linguistic phenomena and can be used to evaluate NLP systems at a fine-grained level. We conclude by arguing that, when evaluating NLP systems, the community should utilize newly introduced RTE datasets that focus on specific linguistic phenomena.
