Paper Title
Hysteresis-Based RL: Robustifying Reinforcement Learning-based Control Policies via Hybrid Control
Paper Authors
Paper Abstract
Reinforcement learning (RL) is a promising approach for deriving control policies for complex systems. As we show in two control problems, policies derived using the Proximal Policy Optimization (PPO) and Deep Q-Network (DQN) algorithms may lack robustness guarantees. Motivated by these issues, we propose a new hybrid algorithm, which we call Hysteresis-Based RL (HyRL), that augments an existing RL algorithm with hysteresis switching and two stages of learning. We illustrate its properties in two examples for which PPO and DQN fail.
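The abstract's central mechanism, hysteresis switching between policies, can be pictured with a minimal sketch. This is an illustration only: the class name, the `region_indicator` map, the band half-width `eps`, and the stand-in policies are all hypothetical and not taken from the paper, and HyRL's two-stage learning procedure is not reproduced here.

```python
class HysteresisSwitch:
    """Select between two policies with hysteresis, so that small
    perturbations near the switching boundary cannot cause chattering."""

    def __init__(self, policy_a, policy_b, region_indicator, eps=0.1):
        self.policies = (policy_a, policy_b)   # hypothetical pre-trained policies
        self.indicator = region_indicator      # maps state -> scalar; sign picks a region
        self.eps = eps                         # half-width of the hysteresis band
        self.mode = 0                          # logic variable: index of the active policy

    def act(self, state):
        s = self.indicator(state)
        # Switch only once the indicator leaves the overlap band [-eps, eps];
        # inside the band the previously active policy is kept.
        if self.mode == 0 and s > self.eps:
            self.mode = 1
        elif self.mode == 1 and s < -self.eps:
            self.mode = 0
        return self.policies[self.mode](state)


# Example with trivial stand-in policies: the switch holds its mode
# inside the band instead of oscillating across the boundary.
switch = HysteresisSwitch(lambda s: -1.0, lambda s: +1.0,
                          region_indicator=lambda s: s[0])
print(switch.act([0.05]))   # inside the band: stays in mode 0 -> -1.0
print(switch.act([0.50]))   # past +eps: switches to mode 1 -> +1.0
print(switch.act([0.05]))   # back inside the band: mode 1 is kept -> +1.0
```

The design point is the overlap: because the two switching thresholds (+eps and -eps) are separated, measurement noise near the boundary cannot trigger rapid back-and-forth switching, which is consistent with the kind of robustness failure the abstract points to in single-policy PPO and DQN controllers.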