Paper title
Learning Expected Reward for Switched Linear Control Systems: A Non-Asymptotic View
Paper authors
Paper abstract
In this work, we show the existence of an invariant ergodic measure for switched linear dynamical systems (SLDSs) under a norm-stability assumption on the system dynamics in some unbounded subset of $\mathbb{R}^{n}$. Consequently, given a stationary Markov control policy, we derive non-asymptotic bounds for learning the expected reward (w.r.t. the invariant ergodic measure to which our closed-loop system mixes) from time averages using Birkhoff's Ergodic Theorem. The presented results provide a foundation for deriving non-asymptotic analysis for average reward-based optimal control of SLDSs. Finally, we illustrate the presented theoretical results in two case studies.
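To make the idea of learning an expected reward from time averages concrete, the following is a minimal sketch on a toy model. It is not the paper's setting or either of its case studies. Everything in it is an assumption chosen for illustration: the scalar two-mode dynamics, the i.i.d. uniform switching rule, the linear feedback policy, the quadratic reward, and all names (MODE_GAINS, FEEDBACK_GAIN, time_average_reward, stationary_expected_reward). For this toy model the expected reward under the invariant measure has a closed form, so the time average along a single trajectory can be compared against it.

```python
import random

# Hypothetical two-mode scalar system (an assumption, not taken from the paper):
#   x[t+1] = a[mode] * x[t] + u[t] + w[t],   w[t] ~ N(0, 1)
MODE_GAINS = (0.9, -0.4)   # open-loop gain a[mode] for each mode
FEEDBACK_GAIN = 0.3        # stationary policy u = -FEEDBACK_GAIN * x


def policy(x):
    """Stationary Markov policy: the control depends only on the current state."""
    return -FEEDBACK_GAIN * x


def reward(x, u):
    """Illustrative quadratic reward (a negative cost)."""
    return -(x * x + u * u)


def time_average_reward(num_steps, x0=0.0, seed=0):
    """Average the reward along one closed-loop trajectory of length num_steps."""
    rng = random.Random(seed)
    x = x0
    total = 0.0
    for _ in range(num_steps):
        u = policy(x)
        total += reward(x, u)
        mode = rng.randrange(2)  # assumed i.i.d. uniform switching between modes
        x = MODE_GAINS[mode] * x + u + rng.gauss(0.0, 1.0)
    return total / num_steps


def stationary_expected_reward():
    """Closed-form expected reward for this toy model, used as a sanity check."""
    closed_loop = [a - FEEDBACK_GAIN for a in MODE_GAINS]
    mean_sq_gain = sum(c * c for c in closed_loop) / len(closed_loop)
    # E[x^2] under the invariant measure solves m = mean_sq_gain * m + 1.
    second_moment = 1.0 / (1.0 - mean_sq_gain)
    return -(1.0 + FEEDBACK_GAIN ** 2) * second_moment


if __name__ == "__main__":
    target = stationary_expected_reward()
    for n in (100, 10_000, 200_000):
        print(n, time_average_reward(n), target)
```

Running the script prints the time average for increasing trajectory lengths next to the closed-form value. The gap should shrink as the trajectory grows, which is the kind of finite-sample behavior a non-asymptotic bound quantifies. The sketch does not compute any such bound.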