Paper Title
Survey Bandits with Regret Guarantees
Paper Authors
论文作者 (not listed in this excerpt)
Paper Abstract
We consider a variant of the contextual bandit problem. In standard contextual bandits, when a user arrives we get the user's complete feature vector and then assign a treatment (arm) to that user. In a number of applications (like healthcare), collecting features from users can be costly. To address this issue, we propose algorithms that avoid needless feature collection while maintaining strong regret guarantees.
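To make the two interaction protocols concrete, below is a minimal Python sketch contrasting the standard contextual bandit round (full feature vector observed) with the survey-bandit round (only a chosen subset of features is collected). The linear reward model, the random placeholder policy, and the names `standard_contextual_round` / `survey_round` are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

# Toy illustration of the two protocols described in the abstract.
# The reward model and the random placeholder policy are assumptions
# for illustration only; they are not the paper's proposed algorithms.

rng = np.random.default_rng(0)
n_features, n_arms = 5, 3
true_theta = rng.normal(size=(n_arms, n_features))  # unknown arm parameters

def standard_contextual_round():
    """Standard contextual bandit: the full feature vector is observed."""
    x = rng.normal(size=n_features)            # complete user context
    arm = int(rng.integers(n_arms))            # placeholder policy
    reward = true_theta[arm] @ x + rng.normal(scale=0.1)
    return x, arm, reward

def survey_round(query_features):
    """Survey-bandit variant: only the queried features are collected,
    each query incurring a (conceptual) collection cost."""
    x_full = rng.normal(size=n_features)
    x_observed = {j: x_full[j] for j in query_features}  # partial context
    arm = int(rng.integers(n_arms))            # placeholder policy
    reward = true_theta[arm] @ x_full + rng.normal(scale=0.1)
    return x_observed, arm, reward

# Example: collect only the first two features in each round.
for t in range(3):
    obs, arm, r = survey_round(query_features=[0, 1])
    print(t, obs, arm, round(r, 3))
```

A survey-bandit algorithm would replace the placeholder policy with one that decides, per round, which features are worth querying before committing to an arm, trading off collection cost against the regret incurred from acting on partial context.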