Paper Title
Decisions that Explain Themselves: A User-Centric Deep Reinforcement Learning Explanation System
Paper Authors
Paper Abstract
With deep reinforcement learning (RL) systems such as autonomous driving being widely deployed yet remaining largely opaque, developers frequently use explainable RL (XRL) tools to better understand and work with deep RL agents. However, previous XRL work has taken a techno-centric research approach, ignoring how RL developers perceive the generated explanations. Through a pilot study, we identify the major goals RL practitioners pursue when using XRL methods, along with four pitfalls that widen the gap between existing XRL methods and these goals. The pitfalls include inaccessible reasoning processes, inconsistent or unintelligible explanations, and explanations that cannot be generalized. To bridge the discovered gap, we propose a counterfactual-inference-based explanation method that uncovers the details of an RL agent's reasoning process and generates natural language explanations. Around this method, we build an interactive XRL system in which users can actively explore explanations and influential information. In a user study with 14 participants, we validated that developers identified 20.9% more abnormal behaviors and limitations of RL agents with our system than with the baseline method, and that our system helped end users improve their performance in actionability tests by 25.1% in an autonomous driving task and by 16.9% in a StarCraft II micromanagement task.
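To make the idea of counterfactual-inference-based explanation concrete, the following is a minimal illustrative sketch, not the authors' actual method (which inspects the agent's internal reasoning process): it perturbs individual state features, re-queries the policy, and turns any resulting action flips into natural language statements. All identifiers here (`explain_action`, `toy_policy`, the perturbation values) are hypothetical and chosen only for the example.

```python
# Minimal sketch of a counterfactual-style explanation for an RL policy.
# Hypothetical names throughout; not the system described in the paper.

from typing import Callable, Dict, List

State = Dict[str, float]


def explain_action(
    policy: Callable[[State], str],
    state: State,
    perturbations: Dict[str, List[float]],
) -> List[str]:
    """Return natural-language statements about which features were decisive.

    For each feature, try alternative values; if any alternative flips the
    chosen action, report that feature as influential for the decision.
    """
    chosen = policy(state)
    explanations = []
    for feature, alternatives in perturbations.items():
        for value in alternatives:
            counterfactual = dict(state, **{feature: value})
            alt_action = policy(counterfactual)
            if alt_action != chosen:
                explanations.append(
                    f"The agent chose '{chosen}' partly because {feature}="
                    f"{state[feature]}; if {feature} were {value}, it would "
                    f"choose '{alt_action}' instead."
                )
                break  # one counterfactual per feature suffices for the sketch
    return explanations


if __name__ == "__main__":
    # Toy driving policy: brake when an obstacle is close, otherwise keep speed.
    def toy_policy(s: State) -> str:
        return "brake" if s["obstacle_distance"] < 20.0 else "keep_speed"

    state = {"obstacle_distance": 12.0, "speed": 30.0}
    perturbations = {"obstacle_distance": [50.0], "speed": [10.0]}
    for line in explain_action(toy_policy, state, perturbations):
        print(line)
```

In this toy run, only `obstacle_distance` flips the action, so the sketch would report that feature as the decisive one; the paper's system goes further by grounding such statements in the agent's reasoning process and presenting them interactively.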