Reinforcement Learning for Strategic Recommendations

Paper Title

Reinforcement Learning for Strategic Recommendations

Authors

Georgios Theocharous, Yash Chandak, Philip S. Thomas, Frits de Nijs

Abstract

Strategic recommendations (SR) refer to the problem where an intelligent agent observes the sequential behaviors and activities of users and decides when and how to interact with them to optimize some long-term objectives, both for the user and the business. These systems are in their infancy in the industry and need practical solutions to some fundamental research challenges. At Adobe Research, we have been implementing such systems for various use cases, including point-of-interest recommendations, tutorial recommendations, next-step guidance in multimedia editing software, and ad recommendations for optimizing lifetime value. There are many research challenges when building these systems, such as modeling the sequential behavior of users, deciding when to intervene and offer recommendations without annoying the user, evaluating policies offline with high confidence, safe deployment, non-stationarity, building systems from passive data that do not contain past recommendations, resource-constrained optimization in multi-user systems, scaling to large and dynamic action spaces, and handling and incorporating human cognitive biases. In this paper, we cover various use cases and the research challenges we solved to make these systems practical.
