论文标题
两阶段推荐系统中公平性的不确定性量化
Uncertainty Quantification for Fairness in Two-Stage Recommender Systems
论文作者
论文摘要
许多大规模推荐系统由两个阶段组成。第一阶段有效地筛选了一小部分有希望的候选人的完整项目池,第二阶段模型从中策划了最终建议。在本文中,我们研究了如何确保此两阶段架构中项目的群体公平性。特别是,我们发现现有的第一阶段推荐人可能会选择一套不公平的候选人集,因此对于第二阶段的推荐人没有希望提供公平建议的希望。为此,在不确定性量化的最新进展的推动下,我们提出了两个阈值 - 政策选择规则,可以为第一阶段推荐人提供公平性提供无分配和有限样本的保证。更具体地说,鉴于查询和项目的任何相关模型,以及每个阈值派利的相关项目数量的较低的置信度,这两个规则发现,近乎最佳的候选人集,这些候选人在每组项目中都包含足够的相关项目。为了实例化规则,我们演示了如何从潜在的部分和有偏见的用户反馈数据中得出这样的置信度,这些数据在许多大规模推荐系统中都很丰富。此外,我们还提供有限样本和渐近分析,分析了两个阈值选择规则与最佳阈值的距离。除了进行理论分析之外,我们从经验上表明,这两个规则可以从每个组中始终如一地选择足够的相关项目,同时最大程度地减少各种设置的候选设置的大小。
Many large-scale recommender systems consist of two stages. The first stage efficiently screens the complete pool of items for a small subset of promising candidates, from which the second-stage model curates the final recommendations. In this paper, we investigate how to ensure group fairness to the items in this two-stage architecture. In particular, we find that existing first-stage recommenders might select an irrecoverably unfair set of candidates such that there is no hope for the second-stage recommender to deliver fair recommendations. To this end, motivated by recent advances in uncertainty quantification, we propose two threshold-policy selection rules that can provide distribution-free and finite-sample guarantees on fairness in first-stage recommenders. More concretely, given any relevance model of queries and items and a point-wise lower confidence bound on the expected number of relevant items for each threshold-policy, the two rules find near-optimal sets of candidates that contain enough relevant items in expectation from each group of items. To instantiate the rules, we demonstrate how to derive such confidence bounds from potentially partial and biased user feedback data, which are abundant in many large-scale recommender systems. In addition, we provide both finite-sample and asymptotic analyses of how close the two threshold selection rules are to the optimal thresholds. Beyond this theoretical analysis, we show empirically that these two rules can consistently select enough relevant items from each group while minimizing the size of the candidate sets for a wide range of settings.