Paper Title
Generative Multi-hop Retrieval
Paper Authors
Paper Abstract
A common practice in text retrieval is to use an encoder to map the documents and the query into a common vector space and perform a nearest neighbor search (NNS); multi-hop retrieval often adopts the same paradigm, usually with the modification of iteratively reformulating the query vector so that it can retrieve different documents at each hop. However, such a bi-encoder approach has limitations in multi-hop settings: (1) the reformulated query gets longer as the number of hops increases, which further tightens the embedding bottleneck of the query vector, and (2) it is prone to error propagation. In this paper, we focus on alleviating these limitations in multi-hop settings by formulating the problem in a fully generative way. We propose an encoder-decoder model that performs multi-hop retrieval by simply generating the entire text sequence of each retrieval target, which means the query and the documents interact in the language model's parametric space rather than in L2 or inner-product space as in the bi-encoder approach. Our approach, Generative Multi-hop Retrieval (GMR), consistently achieves comparable or higher performance than bi-encoder models on five datasets while demonstrating a smaller GPU memory and storage footprint.
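The multi-hop loop described above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: `toy_generate` stands in for a real encoder-decoder (e.g. a T5-style model with constrained decoding over the corpus) and here simply looks up a hand-written hop table. The queries, documents, and helper names are all hypothetical; the point is the control flow, in which the generated target text is appended to the query before the next hop.

```python
# Hypothetical two-hop corpus: maps a (reformulated) query to the text of
# the document the model would generate at that hop.
HOP_TABLE = {
    "who directed the film starring the lead of Titanic?":
        "Leonardo DiCaprio starred in Titanic.",
    "who directed the film starring the lead of Titanic? "
    "Leonardo DiCaprio starred in Titanic.":
        "Inception, starring Leonardo DiCaprio, "
        "was directed by Christopher Nolan.",
}


def toy_generate(query: str) -> str:
    """Stand-in for an encoder-decoder generating a retrieval target."""
    return HOP_TABLE[query]


def generative_multihop_retrieve(query: str, num_hops: int) -> list[str]:
    """Generate one target document per hop, feeding each back into the query."""
    retrieved = []
    context = query
    for _ in range(num_hops):
        doc = toy_generate(context)       # generate the full target text
        retrieved.append(doc)
        context = context + " " + doc     # reformulate the query for the next hop
    return retrieved


docs = generative_multihop_retrieve(
    "who directed the film starring the lead of Titanic?", num_hops=2)
```

Note that no document index or nearest-neighbor search appears anywhere: the query interacts with the corpus only through the generator's parameters, which is the source of the storage savings the abstract mentions.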