Rethinking Knowledge Graph Evaluation Under the Open-World Assumption

Paper title

Rethinking Knowledge Graph Evaluation Under the Open-World Assumption

Authors

Haotong Yang, Zhouchen Lin, Muhan Zhang

Abstract

Most knowledge graphs (KGs) are incomplete, which motivates one important research topic on automatically complementing knowledge graphs. However, evaluation of knowledge graph completion (KGC) models often ignores the incompleteness -- facts in the test set are ranked against all unknown triplets which may contain a large number of missing facts not included in the KG yet. Treating all unknown triplets as false is called the closed-world assumption. This closed-world assumption might negatively affect the fairness and consistency of the evaluation metrics. In this paper, we study KGC evaluation under a more realistic setting, namely the open-world assumption, where unknown triplets are considered to include many missing facts not included in the training or test sets. For the currently most used metrics such as mean reciprocal rank (MRR) and Hits@K, we point out that their behavior may be unexpected under the open-world assumption. Specifically, with not many missing facts, their numbers show a logarithmic trend with respect to the true strength of the model, and thus, the metric increase could be insignificant in terms of reflecting the true model improvement. Further, considering the variance, we show that the degradation in the reported numbers may result in incorrect comparisons between different models, where stronger models may have lower metric numbers. We validate the phenomenon both theoretically and experimentally. Finally, we suggest possible causes and solutions for this problem. Our code and data are available at https://github.com/GraphPKU/Open-World-KG .
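
To make the two metrics named in the abstract concrete, here is a minimal sketch of MRR and Hits@K, and of how missing facts ranked above a test fact can lower the reported numbers. This is not the authors' code (that is in the repository linked above). The rank values and the per-fact counts of missing facts are invented purely for illustration, and the function names `mrr` and `hits_at_k` are hypothetical.

```python
def mrr(ranks):
    """Mean reciprocal rank: the average of 1/rank over all test facts."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Hits@K: the fraction of test facts ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Assumed (made-up) ranks of four test facts if every true triplet were known
# and filtered out of the candidate list.
true_ranks = [1, 1, 2, 5]

# Assumed (made-up) number of missing-but-true triplets that the model scores
# above each test fact. Under the closed-world assumption these count as false
# candidates, so they push the test fact further down the list.
missing_above = [0, 1, 1, 0]

observed_ranks = [r + m for r, m in zip(true_ranks, missing_above)]

print("MRR    (complete KG):", mrr(true_ranks))
print("MRR    (observed):   ", mrr(observed_ranks))
print("Hits@1 (complete KG):", hits_at_k(true_ranks, 1))
print("Hits@1 (observed):   ", hits_at_k(observed_ranks, 1))
```

In this toy setting the observed MRR and Hits@1 are both lower than the values computed with complete knowledge, although the model's scores have not changed. It only illustrates the direction of the effect, not the logarithmic trend or the variance analysis the abstract describes.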
