Paper Title

Data-driven Dynamic Multi-objective Optimal Control: An Aspiration-satisfying Reinforcement Learning Approach

Authors

Mazouchi, Majid; Yang, Yongliang; Modares, Hamidreza

Abstract

This paper presents an iterative data-driven algorithm for solving dynamic multi-objective (MO) optimal control problems arising in the control of nonlinear continuous-time systems. It is first shown that the Hamiltonian functional corresponding to each objective can be leveraged to compare the performance of admissible policies. Hamiltonian inequalities are then introduced whose satisfaction guarantees that the objectives' aspirations are met. An aspiration-satisfying dynamic optimization framework is then presented to optimize the main objective while satisfying the aspirations of the other objectives. The relation to the satisficing (good enough) decision-making framework is shown. A sum-of-squares (SOS) based iterative algorithm is developed to solve the formulated aspiration-satisfying MO optimization. To obviate the requirement of complete knowledge of the system dynamics, a data-driven satisficing reinforcement learning approach is proposed that solves the SOS optimization problem in real time using only system trajectories measured over a time interval. Finally, two simulation examples are provided to show the effectiveness of the proposed algorithm.
