Paper Title

Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve

Paper Authors

Oshin Agarwal, Yinfei Yang, Byron C. Wallace, Ani Nenkova

Paper Abstract

Named Entity Recognition systems achieve remarkable performance on domains such as English news. It is natural to ask: What are these models actually learning to achieve this? Are they merely memorizing the names themselves? Or are they capable of interpreting the text and inferring the correct entity type from the linguistic context? We examine these questions by contrasting the performance of several variants of LSTM-CRF architectures for named entity recognition, with some provided only representations of the context as features. We also perform similar experiments for BERT. We find that context representations do contribute to system performance, but that the main factor driving high performance is learning the name tokens themselves. We enlist human annotators to evaluate the feasibility of inferring entity types from the context alone and find that, while people are not able to infer the entity type either for the majority of the errors made by the context-only system, there is some room for improvement. A system should be able to recognize any name in a predictive context correctly and our experiments indicate that current systems may be further improved by such capability.
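
The "context-only" ablation contrasted in the abstract can be pictured with a small preprocessing sketch: entity tokens are hidden from the model so that a tagger can rely only on the surrounding linguistic context. The sketch below is a minimal illustration under our own assumptions (the `mask_entity_tokens` helper and the `[MASK]` placeholder are hypothetical), not the authors' actual implementation.

```python
# Minimal sketch (not the authors' implementation) of a "context-only"
# setup for NER: entity tokens are replaced with a placeholder so that a
# model can only use the surrounding context as evidence.
# The helper name and the "[MASK]" placeholder are illustrative assumptions.

from typing import List, Tuple


def mask_entity_tokens(
    tokens: List[str],
    entity_spans: List[Tuple[int, int]],  # (start, end) token indices, end exclusive
    placeholder: str = "[MASK]",
) -> List[str]:
    """Return a copy of `tokens` with every entity token hidden."""
    masked = list(tokens)
    for start, end in entity_spans:
        for i in range(start, end):
            masked[i] = placeholder
    return masked


if __name__ == "__main__":
    sentence = ["Yesterday", ",", "Obama", "visited", "Berlin", "."]
    spans = [(2, 3), (4, 5)]  # "Obama" (PER), "Berlin" (LOC)
    print(mask_entity_tokens(sentence, spans))
    # ['Yesterday', ',', '[MASK]', 'visited', '[MASK]', '.']
    # A context-only system must infer PER/LOC from the remaining words alone.
```

Comparing a standard tagger with one trained and evaluated on such masked inputs is one way to separate what is memorized from the name tokens themselves from what is genuinely inferred from context.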
