Neural-iLQR: A Learning-Aided Shooting Method for Trajectory Optimization

Paper Title

Neural-iLQR: A Learning-Aided Shooting Method for Trajectory Optimization

Paper Authors

Cheng, Zilong; Li, Yulin; Chen, Kai; Ma, Jun; Lee, Tong Heng

Paper Abstract

The iterative linear quadratic regulator (iLQR) has gained wide popularity in addressing trajectory optimization problems with nonlinear system models. However, as a model-based shooting method, it relies heavily on an accurate system model to update the optimal control actions and the trajectory determined with forward integration, and is therefore vulnerable to inevitable model inaccuracies. Recently, learning-based methods for optimal control problems have made significant progress in handling unknown system models, particularly when the system has complex interactions with the environment. Yet a deep neural network is normally required to fit a large volume of sampled data. In this work, we present Neural-iLQR, a learning-aided shooting method over the unconstrained control space, in which a neural network with a simple structure is used to represent the local system model. In this framework, the trajectory optimization task is achieved by iteratively refining the optimal policy and the neural network simultaneously, without relying on prior knowledge of the system model. Through comprehensive evaluations on two illustrative control tasks, the proposed method is shown to outperform the conventional iLQR significantly in the presence of inaccuracies in system models.
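The abstract's core loop (fit a local model from sampled data, optimize the controls on that model, then roll the result forward on the real system) can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions: a least-squares linear fit stands in for the simple neural network, the "real" system is a hypothetical discrete double integrator, and only one fit-and-optimize round is shown rather than the iterative refinement the paper describes. For a linear system with quadratic cost, the iLQR backward pass reduces to the finite-horizon LQR Riccati recursion used here. All function names and weights are illustrative.

```python
import numpy as np

def true_step(x, u):
    """Hypothetical 'real' system: a discrete double integrator (dt = 0.1)."""
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.005], [0.1]])
    return A @ x + B @ u

def fit_local_model(X, U, X_next):
    """Least-squares fit of x' ~ A x + B u (assumed stand-in for the neural network)."""
    Z = np.hstack([X, U])                           # regressors, shape (N, n + m)
    W, *_ = np.linalg.lstsq(Z, X_next, rcond=None)  # solves Z @ W ~ X_next
    n = X.shape[1]
    return W[:n].T, W[n:].T                         # A_hat (n, n), B_hat (n, m)

def lqr_backward_pass(A, B, Q, R, Qf, horizon):
    """Riccati recursion on the learned model; returns gains K_0..K_{T-1}, u = -K x."""
    P = Qf
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]                              # computed last-to-first, so reverse

def rollout(x0, gains):
    """Forward pass (shooting): apply the feedback policy to the true system."""
    xs = [x0]
    for K in gains:
        xs.append(true_step(xs[-1], -K @ xs[-1]))
    return np.array(xs)

# Sample transitions from the system without using its equations directly.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
U = rng.normal(size=(200, 1))
X_next = np.array([true_step(x, u) for x, u in zip(X, U)])

# One round: learn the model, optimize on it, then shoot on the true system.
A_hat, B_hat = fit_local_model(X, U, X_next)
gains = lqr_backward_pass(A_hat, B_hat, Q=np.eye(2), R=0.1 * np.eye(1),
                          Qf=10.0 * np.eye(2), horizon=100)
xs = rollout(np.array([1.0, 0.0]), gains)
```

For a nonlinear system, a single global linear fit would no longer be adequate; the model would be refit around each new trajectory, which is the role the abstract assigns to the iteratively refined neural network.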
