Evaluating Dense Passage Retrieval using Transformers

Paper title

Evaluating Dense Passage Retrieval using Transformers

Author

Sadri, Nima

Abstract

Although Transformer-based representational retrieval models have made major advances over the past few years, and despite widely accepted conventions and best practices for testing such models, no $\textit{standardized}$ evaluation framework for them has been developed. In this work, we formalize the best practices and conventions followed by researchers in the literature, paving the way for more standardized evaluations and therefore fairer comparisons between models. Our framework (1) embeds the documents and queries; (2) for each query-document pair, computes the relevance score based on the dot product of the document and query embeddings; (3) uses the $\texttt{dev}$ set of the MSMARCO dataset to evaluate the models; and (4) uses the $\texttt{trec_eval}$ script to calculate MRR@100, which is the primary metric used to evaluate the models. Most importantly, we showcase the use of this framework by experimenting on some of the most well-known dense retrieval models.
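Steps (2) and (4) of the abstract can be illustrated with a minimal sketch. It is not the paper's code: the two-dimensional embeddings, the document and query IDs, and the helper names `dot`, `rank_documents`, and `mrr_at_k` are all hypothetical, and the metric is computed in plain Python rather than with the `trec_eval` script that the framework uses.

```python
# Illustrative only: toy embeddings stand in for the output of a Transformer encoder.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rank_documents(query_emb, doc_embs):
    """Return document IDs sorted by descending dot-product relevance score."""
    return sorted(doc_embs, key=lambda d: dot(query_emb, doc_embs[d]), reverse=True)

def mrr_at_k(rankings, relevant, k=100):
    """Mean reciprocal rank of the first relevant document within the top k."""
    total = 0.0
    for qid, ranked in rankings.items():
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant.get(qid, set()):
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)

# Hypothetical toy collection, queries, and relevance judgments.
doc_embs = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
query_embs = {"q1": [1.0, 0.2], "q2": [0.1, 1.0]}
relevant = {"q1": {"d1"}, "q2": {"d2"}}

rankings = {qid: rank_documents(q, doc_embs) for qid, q in query_embs.items()}
mrr = mrr_at_k(rankings, relevant, k=100)
print(rankings)
print(mrr)
```

In this toy run the document `d3` outranks the relevant document for both queries because the dot product rewards larger embedding norms, so each query contributes a reciprocal rank of 1/2.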
