Paper Title
CoDA21: Evaluating Language Understanding Capabilities of NLP Models With Context-Definition Alignment
Paper Authors
Paper Abstract
Pretrained language models (PLMs) have achieved superhuman performance on many benchmarks, creating a need for harder tasks. We introduce CoDA21 (Context Definition Alignment), a challenging benchmark that measures natural language understanding (NLU) capabilities of PLMs: Given a definition and a context each for k words, but not the words themselves, the task is to align the k definitions with the k contexts. CoDA21 requires a deep understanding of contexts and definitions, including complex inference and world knowledge. We find that there is a large gap between human and PLM performance, suggesting that CoDA21 measures an aspect of NLU that is not sufficiently covered in existing benchmarks.
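To make the task concrete: a simple baseline scores every definition-context pair and solves the resulting one-to-one assignment. The sketch below is a minimal illustration of that setup, not the paper's evaluation protocol; the `embed` function is a hypothetical stand-in for any sentence encoder (e.g., a PLM's pooled output).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def embed(texts):
    # Hypothetical stand-in for a real sentence encoder (e.g., mean-pooled
    # PLM hidden states); returns one vector per input text.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 768))

def align(definitions, contexts):
    """Align k definitions with k contexts by maximizing total
    cosine similarity, solved as a linear assignment problem."""
    d = embed(definitions)
    c = embed(contexts)
    # Normalize rows so the dot product is cosine similarity.
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    sim = d @ c.T  # sim[i, j] = cos(definition i, context j)
    # linear_sum_assignment minimizes cost, so negate the similarities.
    rows, cols = linear_sum_assignment(-sim)
    return list(zip(rows.tolist(), cols.tolist()))

# Usage: pairs of (definition index, context index) for k = 2 items.
print(align(["a domesticated feline", "a celestial body"],
            ["The ___ purred on the sofa.", "The ___ rose over the hills."]))
```

With a real encoder in place of the random `embed`, accuracy on such alignments is exactly what a benchmark of this kind would measure.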