Paper Title

Explainable Reinforcement Learning: A Survey

Authors

Erika Puiutta, Eric M. S. P. Veith

Abstract

Explainable Artificial Intelligence (XAI), i.e., the development of more transparent and interpretable AI models, has gained increased traction over the last few years. This is due to the fact that, in conjunction with their growth into powerful and ubiquitous tools, AI models exhibit one detrimental characteristic: a performance-transparency trade-off. This describes the fact that the more complex a model's inner workings, the less clear it is how its predictions or decisions were achieved. But, especially considering Machine Learning (ML) methods like Reinforcement Learning (RL) where the system learns autonomously, the necessity to understand the underlying reasoning for their decisions becomes apparent. Since, to the best of our knowledge, there exists no single work offering an overview of Explainable Reinforcement Learning (XRL) methods, this survey attempts to address this gap. We give a short summary of the problem, a definition of important terms, and offer a classification and assessment of current XRL methods. We found that a) the majority of XRL methods function by mimicking and simplifying a complex model instead of designing an inherently simple one, and b) XRL (and XAI) methods often neglect to consider the human side of the equation, not taking into account research from related fields like psychology or philosophy. Thus, an interdisciplinary effort is needed to adapt the generated explanations to a (non-expert) human user in order to effectively progress in the field of XRL and XAI in general.
