Paper Title

Learning Expected Reward for Switched Linear Control Systems: A Non-Asymptotic View

Authors

Muhammad Abdullah Naeem, Miroslav Pajic

Abstract

In this work, we show the existence of an invariant ergodic measure for switched linear dynamical systems (SLDSs) under a norm-stability assumption on the system dynamics in some unbounded subset of $\mathbb{R}^{n}$. Consequently, given a stationary Markov control policy, we derive non-asymptotic bounds for learning the expected reward (with respect to the invariant ergodic measure to which our closed-loop system mixes) from time averages, using Birkhoff's Ergodic Theorem. The presented results provide a foundation for non-asymptotic analysis of average-reward-based optimal control of SLDSs. Finally, we illustrate the presented theoretical results in two case studies.
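The idea of estimating an expected reward from time averages can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar two-mode system, the gains `0.5` and `-0.3`, the sign-based switching rule, the Gaussian noise, and the bounded reward `exp(-x^2)`. It only shows that, for a stable toy system of this kind, time averages computed from very different initial states end up close to each other, which is the behaviour an ergodic theorem describes.

```python
import math
import random


def time_average_reward(x0, steps, seed):
    """Average reward along one trajectory of a toy scalar switched linear system.

    Hypothetical setup (not from the paper): two modes with gains 0.5 and -0.3,
    selected by the sign of the current state, plus unit-variance Gaussian noise.
    """
    rng = random.Random(seed)          # private generator, so runs are reproducible
    gains = (0.5, -0.3)                # both satisfy |a| < 1 (stable in each mode)
    x = x0
    total = 0.0
    for _ in range(steps):
        total += math.exp(-x * x)      # bounded illustrative reward in (0, 1]
        mode = 0 if x >= 0.0 else 1    # state-dependent switching rule
        x = gains[mode] * x + rng.gauss(0.0, 1.0)
    return total / steps


if __name__ == "__main__":
    # Two runs from far-apart initial states and different noise seeds.
    for x0, seed in ((10.0, 1), (-10.0, 2)):
        print(x0, time_average_reward(x0, 100_000, seed))
```

The sketch says nothing about how fast the averages converge; quantifying that rate for finite trajectory lengths is what the non-asymptotic bounds in the abstract address.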
