Paper Title
CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment
Paper Authors
Paper Abstract
Pretrained language models (PLMs) have achieved superhuman performance on many benchmarks, creating a need for harder tasks. We introduce CoDA21 (Context Definition Alignment), a challenging benchmark that measures natural language understanding (NLU) capabilities of PLMs: Given a definition and a context each for k words, but not the words themselves, the task is to align the k definitions with the k contexts. CoDA21 requires a deep understanding of contexts and definitions, including complex inference and world knowledge. We find that there is a large gap between human and PLM performance, suggesting that CoDA21 measures an aspect of NLU that is not sufficiently covered in existing benchmarks.
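To make the task concrete: a simple baseline scores every definition-context pair and solves the resulting one-to-one assignment. The sketch below is a minimal illustration of that setup, not the paper's evaluation protocol; the `embed` function is a hypothetical stand-in for any sentence encoder (e.g., a PLM's pooled output).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def embed(texts):
    # Hypothetical stand-in for a real sentence encoder (e.g., mean-pooled
    # PLM hidden states); returns one vector per input text.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 768))

def align(definitions, contexts):
    """Align k definitions with k contexts by maximizing total
    cosine similarity, solved as a linear assignment problem."""
    d = embed(definitions)
    c = embed(contexts)
    # Normalize rows so the dot product is cosine similarity.
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    sim = d @ c.T  # sim[i, j] = cos(definition i, context j)
    # linear_sum_assignment minimizes cost, so negate the similarities.
    rows, cols = linear_sum_assignment(-sim)
    return list(zip(rows.tolist(), cols.tolist()))

# Usage: pairs of (definition index, context index) for k = 2 items.
print(align(["a domesticated feline", "a celestial body"],
            ["The ___ purred on the sofa.", "The ___ rose over the hills."]))
```

With a real encoder in place of the random `embed`, accuracy on such alignments is exactly what a benchmark of this kind would measure.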