Paper Title
The Role of Complex NLP in Transformers for Text Ranking?
Paper Authors
Paper Abstract
Even though term-based methods such as BM25 provide strong baselines in ranking, under certain conditions they are dominated by large pre-trained masked language models (MLMs) such as BERT. To date, the source of their effectiveness remains unclear. Is it their ability to truly understand meaning by modeling syntactic aspects? We answer this by manipulating the input order and position information so as to destroy the natural sequence order of query and passage, and show that the model still achieves comparable performance. Overall, our results highlight that syntactic aspects do not play a critical role in the effectiveness of re-ranking with BERT. We point to other mechanisms, such as query-passage cross-attention and richer embeddings that capture word meaning based on aggregated context regardless of word order, as the main contributors to its superior performance.
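To make the described manipulation concrete, below is a minimal sketch, not the authors' experimental code, of scoring a query-passage pair with a BERT-style cross-encoder before and after destroying the passage's word order. The checkpoint name and the word-level shuffle are illustrative assumptions, and the paper's position-information ablation is not reproduced here.

```python
import random
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed checkpoint for illustration; any BERT-style cross-encoder
# re-ranker would serve the same purpose.
MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L-6-v2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def shuffle_words(text: str, seed: int = 0) -> str:
    """Destroy the natural word order of a text via random permutation."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

def rank_score(query: str, passage: str) -> float:
    """Relevance score of a query-passage pair from the cross-encoder."""
    inputs = tokenizer(query, passage, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits.squeeze().item()

query = "what causes tides"
passage = "Tides are caused by the gravitational pull of the moon and sun."

# Compare scores for the intact and the order-destroyed passage; the
# paper's finding suggests the two scores should be comparable.
print("original:", rank_score(query, passage))
print("shuffled:", rank_score(query, shuffle_words(passage)))
```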