Paper Title

Residual Policy Learning for Powertrain Control

Authors

Lindsey Kerbel, Beshah Ayalew, Andrej Ivanco, Keith Loiselle

Abstract

Eco-driving strategies have been shown to provide significant reductions in fuel consumption. This paper outlines an active driver assistance approach that uses a residual policy learning (RPL) agent trained to provide residual actions to default powertrain controllers while balancing fuel consumption against other driver-accommodation objectives. Using previous experience, our RPL agent learns improved traction torque and gear shifting residual policies to adapt the operation of the powertrain to variations and uncertainties in the environment. For comparison, we consider a traditional reinforcement learning (RL) agent trained from scratch. Both agents employ the off-policy Maximum A Posteriori Policy Optimization algorithm with an actor-critic architecture. By implementing the agents on a simulated commercial vehicle in various car-following scenarios, we find that the RPL agent quickly learns policies that are significantly improved over a baseline source policy, though by some measures not as good as those eventually attainable by the RL agent trained from scratch.
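The core idea of residual policy learning, a learned correction added on top of a default controller's output, can be illustrated with a minimal sketch. The names `default_torque_controller` and `residual_actor` below are hypothetical placeholders, not from the paper, and the sketch covers only the continuous traction-torque channel; the discrete gear-shift residual mentioned in the abstract would need separate handling.

```python
import numpy as np


def default_torque_controller(obs):
    """Stand-in for the vehicle's default powertrain controller (hypothetical).

    Maps an observation to a nominal traction-torque command.
    """
    return 0.3 * obs["driver_accel_request"]  # illustrative proportional law


def rpl_torque_action(obs, residual_actor, residual_scale=0.1):
    """Compose the executed torque command as baseline + learned residual.

    `residual_actor` is assumed to be a trained actor network callable on the
    observation. Because only the (scaled) residual is learned, early-training
    behavior stays close to the default controller.
    """
    base = default_torque_controller(obs)
    residual = residual_scale * float(residual_actor(obs))
    return base + residual


# Usage example: a trivial residual actor that always outputs zero
# recovers the default controller's command exactly.
if __name__ == "__main__":
    obs = {"driver_accel_request": 2.0}
    print(rpl_torque_action(obs, residual_actor=lambda o: 0.0))
```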
