Paper Title
Hysteresis-Based RL: Robustifying Reinforcement Learning-based Control Policies via Hybrid Control
Paper Authors
Paper Abstract
Reinforcement learning (RL) is a promising approach for deriving control policies for complex systems. As we show in two control problems, policies derived using the Proximal Policy Optimization (PPO) and Deep Q-Network (DQN) algorithms may lack robustness guarantees. Motivated by these issues, we propose a new hybrid algorithm, which we call Hysteresis-Based RL (HyRL), that augments an existing RL algorithm with hysteresis switching and two stages of learning. We illustrate its properties in two examples for which PPO and DQN fail.
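The abstract's central mechanism, hysteresis switching between policies, can be pictured with a minimal sketch. This is an illustration only: the class name, the `region_indicator` map, the band half-width `eps`, and the stand-in policies are all hypothetical and not taken from the paper, and HyRL's two-stage learning procedure is not reproduced here.

```python
class HysteresisSwitch:
    """Select between two policies with hysteresis, so that small
    perturbations near the switching boundary cannot cause chattering."""

    def __init__(self, policy_a, policy_b, region_indicator, eps=0.1):
        self.policies = (policy_a, policy_b)   # hypothetical pre-trained policies
        self.indicator = region_indicator      # maps state -> scalar; sign picks a region
        self.eps = eps                         # half-width of the hysteresis band
        self.mode = 0                          # logic variable: index of the active policy

    def act(self, state):
        s = self.indicator(state)
        # Switch only once the indicator leaves the overlap band [-eps, eps];
        # inside the band the previously active policy is kept.
        if self.mode == 0 and s > self.eps:
            self.mode = 1
        elif self.mode == 1 and s < -self.eps:
            self.mode = 0
        return self.policies[self.mode](state)


# Example with trivial stand-in policies: the switch holds its mode
# inside the band instead of oscillating across the boundary.
switch = HysteresisSwitch(lambda s: -1.0, lambda s: +1.0,
                          region_indicator=lambda s: s[0])
print(switch.act([0.05]))   # inside the band: stays in mode 0 -> -1.0
print(switch.act([0.50]))   # past +eps: switches to mode 1 -> +1.0
print(switch.act([0.05]))   # back inside the band: mode 1 is kept -> +1.0
```

The design point is the overlap: because the two switching thresholds (+eps and -eps) are separated, measurement noise near the boundary cannot trigger rapid back-and-forth switching, which is consistent with the kind of robustness failure the abstract points to in single-policy PPO and DQN controllers.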