Paper Title

Low-variance estimation in the Plackett-Luce model via quasi-Monte Carlo sampling

Paper Authors

Alexander Buchholz, Jan Malte Lichtenberg, Giuseppe Di Benedetto, Yannik Stein, Vito Bellini, Matteo Ruffini

Paper Abstract

The Plackett-Luce (PL) model is ubiquitous in learning-to-rank (LTR) because it provides a useful and intuitive probabilistic model for sampling ranked lists. Counterfactual offline evaluation and optimization of ranking metrics are pivotal for using LTR methods in production. When adopting the PL model as a ranking policy, both tasks require the computation of expectations with respect to the model. These are usually approximated via Monte-Carlo (MC) sampling, since the combinatorial scaling in the number of items to be ranked makes their analytical computation intractable. Despite recent advances in improving the computational efficiency of the sampling process via the Gumbel top-k trick, the MC estimates can suffer from high variance. We develop a novel approach to producing more sample-efficient estimators of expectations in the PL model by combining the Gumbel top-k trick with quasi-Monte Carlo (QMC) sampling, a well-established technique for variance reduction. We illustrate our findings both theoretically and empirically using real-world recommendation data from Amazon Music and the Yahoo learning-to-rank challenge.
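
To make the abstract's core idea concrete, the sketch below (not the authors' code) shows one way to combine the Gumbel top-k trick with quasi-Monte Carlo sampling: a scrambled Sobol sequence drives the Gumbel noise that perturbs the PL log-scores, and the resulting top-k rankings are averaged to estimate an expected ranking metric (DCG@k here). The item scores, list length `k`, and relevance vector are illustrative assumptions, not from the paper.

```python
# Minimal sketch: QMC estimation of an expectation under a Plackett-Luce model
# via the Gumbel top-k trick, assuming hypothetical scores and relevances.
import numpy as np
from scipy.stats import qmc

def sample_pl_rankings_qmc(log_scores, k, m=7, seed=0):
    """Draw 2**m top-k rankings from a PL model via the Gumbel top-k trick,
    driving the Gumbel noise with scrambled Sobol points instead of i.i.d. uniforms."""
    n = len(log_scores)
    sobol = qmc.Sobol(d=n, scramble=True, seed=seed)
    u = sobol.random_base2(m=m)                      # (2**m, n) low-discrepancy points
    u = np.clip(u, 1e-12, 1 - 1e-12)                 # guard the log transforms
    gumbel = -np.log(-np.log(u))                     # inverse-CDF map to Gumbel(0, 1)
    perturbed = log_scores[None, :] + gumbel         # Gumbel-perturbed log-scores
    # Sorting the perturbed scores and keeping the top k yields a PL-distributed ranking.
    return np.argsort(-perturbed, axis=1)[:, :k]

def estimate_expected_dcg(log_scores, relevance, k, m=7, seed=0):
    """QMC estimate of E[DCG@k] under the PL ranking policy."""
    rankings = sample_pl_rankings_qmc(log_scores, k, m=m, seed=seed)
    discounts = 1.0 / np.log2(np.arange(2, k + 2))   # DCG position discounts
    dcg_per_sample = (relevance[rankings] * discounts).sum(axis=1)
    return dcg_per_sample.mean()

# Example usage with hypothetical scores and relevances for 10 items.
rng = np.random.default_rng(0)
log_scores = rng.normal(size=10)
relevance = rng.integers(0, 3, size=10).astype(float)
print(estimate_expected_dcg(log_scores, relevance, k=5))
```

Replacing the Sobol points with `np.random.uniform` recovers the standard Monte Carlo estimator; the low-discrepancy points are what the paper credits for the variance reduction.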
