Paper Title

CT-DQN: Control-Tutored Deep Reinforcement Learning

Paper Authors

De Lellis, Francesco, Coraggio, Marco, Russo, Giovanni, Musolesi, Mirco, di Bernardo, Mario

Paper Abstract

One of the major challenges in Deep Reinforcement Learning for control is the need for extensive training to learn the policy. Motivated by this, we present the design of the Control-Tutored Deep Q-Networks (CT-DQN) algorithm, a Deep Reinforcement Learning algorithm that leverages a control tutor, i.e., an exogenous control law, to reduce learning time. The tutor can be designed using an approximate model of the system, without any assumption about the knowledge of the system's dynamics. There is no expectation that it will be able to achieve the control objective if used stand-alone. During learning, the tutor occasionally suggests an action, thus partially guiding exploration. We validate our approach on three scenarios from OpenAI Gym: the inverted pendulum, lunar lander, and car racing. We demonstrate that CT-DQN is able to achieve better or equivalent data efficiency with respect to the classic function approximation solutions.
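The abstract describes the core mechanism at a high level: during exploration, a control tutor occasionally proposes the action in place of the usual epsilon-greedy choice. A minimal sketch of that idea is shown below. All names here (`q_network`, `tutor`, `tutor_prob`) are illustrative assumptions, and the fixed-probability gating is a simplification; the paper's actual criterion for when the tutor intervenes may differ.

```python
import random
import numpy as np

def select_action(q_network, tutor, state, epsilon, tutor_prob):
    """Sketch of control-tutored action selection (hypothetical names).

    With probability `tutor_prob`, the exogenous control tutor, designed
    from an approximate model of the system, suggests the action; this
    partially guides exploration. Otherwise we fall back to standard
    epsilon-greedy selection over the learned Q-values.
    """
    if random.random() < tutor_prob:
        # Tutor proposes an action from its (possibly crude) control law.
        return tutor(state)
    if random.random() < epsilon:
        # Standard random exploration.
        return random.randrange(q_network.num_actions)
    # Greedy action with respect to the learned Q-function.
    return int(np.argmax(q_network.predict(state)))
```

Since the tutor is not expected to achieve the control objective on its own, its suggestions serve only to bias exploration toward promising regions; one would likely decay `tutor_prob` (and `epsilon`) over training so the learned policy eventually dominates, though the abstract does not specify such a schedule.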
