Paper Title
Inverse Constrained Reinforcement Learning
Paper Authors
Paper Abstract
In real-world settings, numerous constraints are present which are hard to specify mathematically. However, for real-world deployment of reinforcement learning (RL), it is critical that RL agents are aware of these constraints so that they can act safely. In this work, we consider the problem of learning constraints from demonstrations of a constraint-abiding agent's behavior. We experimentally validate our approach and show that our framework can successfully learn the most likely constraints that the agent respects. We further show that these learned constraints are \textit{transferable} to new agents that may have different morphologies and/or reward functions. Previous work in this area has mainly been restricted to tabular (discrete) settings or specific types of constraints, or has assumed knowledge of the environment's transition dynamics. In contrast, our framework is able to learn arbitrary \textit{Markovian} constraints in high dimensions in a completely model-free setting. The code can be found at \url{https://github.com/shehryar-malik/icrl}.
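The core idea the abstract describes, inferring a constraint from demonstrations of a constraint-abiding agent, can be pictured as alternating between (a) training a policy that maximizes reward subject to the current constraint estimate and (b) updating the constraint so that it marks as infeasible the state-action pairs the policy visits but the demonstrator avoids. The sketch below shows only step (b) as a discriminator-style update. It is a minimal illustration under my own assumptions, not the paper's implementation; all names (ConstraintNet, constraint_update, expert_sa, policy_sa) are hypothetical placeholders, and the random tensors stand in for real expert demonstrations and policy rollouts.

import torch
import torch.nn as nn

# Hypothetical sketch: zeta(s, a) estimates the probability that the
# state-action pair (s, a) is feasible (i.e., does not violate the
# unknown constraint). This is NOT the authors' code.
class ConstraintNet(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, states: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([states, actions], dim=-1)).squeeze(-1)

def constraint_update(zeta, opt, expert_sa, policy_sa):
    """One gradient step: push zeta toward 1 on demonstrated (state, action)
    pairs and toward 0 on pairs visited by the current unconstrained policy,
    which the demonstrator presumably avoided because of the constraint."""
    exp_feas = zeta(*expert_sa)    # demonstrations: should be feasible
    pol_feas = zeta(*policy_sa)    # policy visits: candidate violations
    loss = -(torch.log(exp_feas + 1e-8).mean()
             + torch.log(1.0 - pol_feas + 1e-8).mean())
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random placeholder data in place of real rollouts.
if __name__ == "__main__":
    zeta = ConstraintNet(state_dim=4, action_dim=2)
    opt = torch.optim.Adam(zeta.parameters(), lr=1e-3)
    expert_sa = (torch.randn(128, 4), torch.randn(128, 2))
    policy_sa = (torch.randn(128, 4), torch.randn(128, 2))
    for step in range(100):
        constraint_update(zeta, opt, expert_sa, policy_sa)

In the full method, the learned feasibility function would feed back into a constrained policy-optimization step, and the two steps would alternate; since both the constraint model and the policy are learned from sampled trajectories alone, no transition model is required, matching the abstract's model-free claim.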