AutoEG: Automated Experience Grafting for Off-Policy Deep Reinforcement Learning

Paper Title

AutoEG: Automated Experience Grafting for Off-Policy Deep Reinforcement Learning

Authors

Keting Lu, Shiqi Zhang, Xiaoping Chen

Abstract

Deep reinforcement learning (RL) algorithms frequently require a prohibitive amount of interaction experience to ensure the quality of the learned policies. This limitation arises partly because the agent cannot learn much from the many low-quality trials in the early learning phase, which results in a low learning rate. To address this limitation, this paper makes a twofold contribution. First, we develop an algorithm, called Experience Grafting (EG), that enables RL agents to reorganize segments of the few high-quality trajectories in the experience pool to generate many synthetic trajectories while retaining their quality. Second, building on EG, we further develop an AutoEG agent that automatically learns to adjust the grafting-based learning strategy. Results collected from a set of six robotic control environments show that, in comparison to a standard deep RL algorithm (DDPG), AutoEG increases the speed of the learning process by at least 30%.
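
The abstract does not spell out how segments are recombined, so the following is only a rough illustrative sketch of the general idea of grafting, not the authors' EG algorithm. It assumes a trajectory is a list of `(state, action, reward)` tuples with states given as tuples of floats, and it splices the prefix of one high-return trajectory onto the suffix of another wherever two states lie within a Euclidean distance `tol`. The function name `graft`, the `top_k` filter, and the distance-based matching rule are all hypothetical choices made for illustration.

```python
import math


def graft(trajectories, returns, top_k=2, tol=0.1):
    """Illustrative sketch only (assumed matching rule, not the paper's method).

    trajectories: list of trajectories, each a list of (state, action, reward)
    returns:      one scalar return per trajectory
    """
    # Keep only the top_k trajectories by return (the "high-quality" ones).
    order = sorted(range(len(trajectories)), key=lambda i: returns[i], reverse=True)
    elite = [trajectories[i] for i in order[:top_k]]

    synthetic = []
    for a in elite:
        for b in elite:
            if a is b:
                continue
            for i, (s_a, _, _) in enumerate(a):
                if i == 0:
                    continue  # an empty prefix would just copy a suffix of b
                for j, (s_b, _, _) in enumerate(b):
                    # Assumed graft condition: the two states are close enough
                    # that b's continuation is plausible after a's prefix.
                    if math.dist(s_a, s_b) <= tol:
                        synthetic.append(a[:i] + b[j:])
    return synthetic
```

With two elite trajectories that pass near a common state, this yields one synthetic trajectory per direction of splicing; a real system would also need to decide how to score or filter the grafted results, which the abstract attributes to the learned AutoEG strategy.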
