Paper Title
Evaluating Elements of Web-based Data Enrichment for Pseudo-Relevance Feedback Retrieval
Paper Authors
Paper Abstract
In this work, we analyze a pseudo-relevance retrieval method based on the results of web search engines. By enriching topics with text data from web search engine result pages and linked contents, we train topic-specific and cost-efficient classifiers that can be used to search test collections for relevant documents. Building upon attempts initially made at TREC Common Core 2018 by Grossman and Cormack, we address questions of system performance over time, considering different search engines, queries, and test collections. Our experimental results show how and to what extent the considered components affect the retrieval performance. Overall, the analyzed method is robust in terms of average retrieval performance and is a promising way to use web content for the data enrichment of relevance feedback methods.
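The pipeline described in the abstract (enrich a topic with web text, fit a topic-specific model, rank a test collection) can be illustrated with a minimal sketch. The abstract does not specify the classifier or its features, so the sketch below substitutes a simple nearest-centroid (Rocchio-style) scorer over term-frequency vectors; it is not the authors' implementation. All function names, the hard-coded example texts standing in for scraped search results, and the use of unrelated "background" texts as negative examples are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_vector(text):
    """Return an L2-normalised term-frequency vector as a dict."""
    counts = Counter(tokenize(text))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {term: c / norm for term, c in counts.items()}


def centroid(texts):
    """Average the normalised vectors of a list of texts."""
    total = Counter()
    for text in texts:
        for term, weight in tf_vector(text).items():
            total[term] += weight
    n = max(len(texts), 1)
    return {term: w / n for term, w in total.items()}


def score(doc_text, pos_centroid, neg_centroid):
    """Similarity to the topic centroid minus similarity to the background."""
    vec = tf_vector(doc_text)

    def dot(cent):
        return sum(w * cent.get(term, 0.0) for term, w in vec.items())

    return dot(pos_centroid) - dot(neg_centroid)


def rank(collection, web_texts, background_texts):
    """Rank collection documents for one topic, best first."""
    pos = centroid(web_texts)         # pseudo-relevant: web text for the topic
    neg = centroid(background_texts)  # assumed negatives: unrelated text
    scored = [(doc_id, score(text, pos, neg))
              for doc_id, text in collection.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-ins for scraped result pages and a test collection.
web_texts = [
    "solar panel efficiency improves with new cell materials",
    "researchers report record solar cell efficiency",
]
background_texts = [
    "football club wins the league title",
    "stock markets fall after rate decision",
]
collection = {
    "d1": "new solar cell reaches record efficiency",
    "d2": "league title decided in final football match",
    "d3": "central bank rate decision moves markets",
}

ranking = rank(collection, web_texts, background_texts)
for doc_id, value in ranking:
    print(doc_id, round(value, 3))
```

In this sketch the web texts play the role of pseudo-relevance feedback: no human relevance judgments are needed, which is what makes the approach cost-efficient. Swapping the search engine or the query formulation would only change the contents of `web_texts`.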