Extended Radial Basis Function Controller for Reinforcement Learning

Paper Title

Extended Radial Basis Function Controller for Reinforcement Learning

Authors

Capel, Nicholas; Zhang, Naifu

Abstract

There have been attempts in reinforcement learning to exploit a priori knowledge about the structure of the system. This paper proposes a hybrid reinforcement learning controller which dynamically interpolates between a model-based linear controller and an arbitrary differentiable policy. The linear controller is designed from knowledge of a locally linearised model, and stabilises the system in a neighbourhood of an operating point. The interpolation coefficients between the two controllers are determined by a scaled distance function measuring the distance between the current state and the operating point. The overall hybrid controller is proven to maintain the stability guarantee in a neighbourhood of the operating point while retaining the universal function approximation property of the arbitrary non-linear policy. The controller is trained in both model-based (PILCO) and model-free (DDPG) frameworks. Simulation experiments performed in OpenAI Gym demonstrate the stability and robustness of the proposed hybrid controller. This paper thus introduces a principled method for directly importing control methodology into reinforcement learning.
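
The interpolation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the scaled distance function or of the interpolation coefficients, so the Gaussian radial basis weight below is an assumption suggested by the paper's title, and the names `rbf_weight`, `hybrid_action`, `K`, `x_op`, `scale` and `policy` are hypothetical, chosen only for illustration.

```python
import numpy as np


def rbf_weight(x, x_op, scale):
    """Assumed Gaussian RBF weight: 1 at the operating point, decaying to 0 far away."""
    d = (np.asarray(x, dtype=float) - np.asarray(x_op, dtype=float)) / scale
    return float(np.exp(-np.dot(d, d)))


def hybrid_action(x, x_op, K, policy, scale):
    """Blend a linear feedback law with an arbitrary policy.

    Near x_op the linear controller u = -K (x - x_op) dominates;
    far from x_op the (hypothetical) learned policy dominates.
    """
    x = np.asarray(x, dtype=float)
    w = rbf_weight(x, x_op, scale)
    u_lin = -K @ (x - x_op)                      # model-based linear controller
    u_pol = np.asarray(policy(x), dtype=float)   # arbitrary differentiable policy
    return w * u_lin + (1.0 - w) * u_pol


if __name__ == "__main__":
    x_op = np.zeros(2)                 # operating point (assumed at the origin)
    K = np.array([[1.0, 2.0]])         # illustrative gain, not taken from the paper
    policy = lambda x: np.array([np.tanh(x.sum())])  # stand-in for a neural policy
    for x in ([0.0, 0.0], [0.5, -0.5], [10.0, 10.0]):
        print(x, rbf_weight(x, x_op, 1.0), hybrid_action(x, x_op, K, policy, 1.0))
```

With this choice of weight, the blended action equals the linear controller's output exactly at the operating point and approaches the policy's output as the state moves away. This mirrors the abstract's claim of local stability combined with the expressiveness of the non-linear policy, although the paper's own construction may differ in detail.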
