Paper Title

A Review of Off-Policy Evaluation in Reinforcement Learning

Authors

Masatoshi Uehara, Chengchun Shi, Nathan Kallus

Abstract

Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has been recently applied to solve a number of challenging problems. In this paper, we primarily focus on off-policy evaluation (OPE), one of the most fundamental topics in RL. In recent years, a number of OPE methods have been developed in the statistics and computer science literature. We provide a discussion on the efficiency bound of OPE, some of the existing state-of-the-art OPE methods, their statistical properties and some other related research directions that are currently actively explored.
