Learning Expected Reward for Switched Linear Control Systems: A Non-Asymptotic View

Paper Title

Learning Expected Reward for Switched Linear Control Systems: A Non-Asymptotic View

Authors

Naeem, Muhammad Abdullah; Pajic, Miroslav

Abstract

In this work, we show the existence of an invariant ergodic measure for switched linear dynamical systems (SLDSs) under a norm-stability assumption on the system dynamics in some unbounded subset of $\mathbb{R}^{n}$. Consequently, given a stationary Markov control policy, we derive non-asymptotic bounds for learning the expected reward (with respect to the invariant ergodic measure to which the closed-loop system mixes) from time averages, using Birkhoff's Ergodic Theorem. The presented results provide a foundation for non-asymptotic analysis of average-reward-based optimal control of SLDSs. Finally, we illustrate the theoretical results in two case studies.
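The idea of estimating an expected reward from time averages can be sketched numerically. The following is only an illustrative toy, not the paper's method or either of its case studies. The two closed-loop mode matrices, the sign-based switching rule, the Gaussian noise level, and the reward r(x) = -||x||^2 are all hypothetical choices. Both matrices have spectral norm below 1, which is one simple reading of a norm-stability condition.

```python
import numpy as np

def time_average_reward(x0, horizon, seed):
    """Time-averaged reward along one trajectory of a toy switched linear system."""
    rng = np.random.default_rng(seed)
    # Hypothetical closed-loop matrices, one per mode (spectral norm < 1 for both).
    modes = [
        np.array([[0.5, 0.1], [0.0, 0.4]]),
        np.array([[0.3, -0.2], [0.2, 0.3]]),
    ]
    x = np.asarray(x0, dtype=float)
    total = 0.0
    for _ in range(horizon):
        total += -float(x @ x)            # assumed reward r(x) = -||x||^2
        mode = 0 if x[0] >= 0 else 1      # assumed state-dependent switching rule
        x = modes[mode] @ x + 0.1 * rng.standard_normal(2)
    return total / horizon

# Two trajectories with different initial states and independent noise.
avg_far = time_average_reward([5.0, -5.0], horizon=20000, seed=0)
avg_origin = time_average_reward([0.0, 0.0], horizon=20000, seed=1)
print(avg_far, avg_origin)
```

If the toy system is ergodic, the two printed averages should be close to each other for long horizons, regardless of the initial state. The non-asymptotic bounds described in the abstract quantify how fast such time averages approach the expected reward; this sketch makes no attempt to reproduce those bounds.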
