求解零和单侧的部分可观察的随机游戏

论文标题

求解零和单侧的部分可观察的随机游戏

Solving Zero-Sum One-Sided Partially Observable Stochastic Games

论文作者

Horák, Karel, Bošanský, Branislav, Kovařík, Vojtěch, Kiekintveld, Christopher

论文摘要

许多安全性和其他现实世界的情况本质上是动态的，可以建模为严格竞争性（或零和零）动态游戏。在这些领域中，代理人会采取行动来影响环境，并就对手行动的情况和影响接收观察结果（可能是不完美的）。此外，代理可以执行的操作总数没有限制 - 也就是说，没有固定的视野。这些设置可以建模为可观察到的随机游戏（POSG）。但是，解决一般POSG在计算上是棘手的，因此我们专注于一个称为单方面POSG的POSG的广泛子类。在这些游戏中，只有一个代理商的对手对当前情况有充分的了解。我们提供了一个用于求解单方面POSG的完整图：我们（1）对单方面POSG及其价值功能进行了理论分析，（2）表明，值意式算法的变体在这种情况下会收敛，（3）适应启发式搜索值练习算法，以求解单方面的策略，以描述5个策略（4）使用该功能（4），（4）如何实现（4）范围（4），（4）在这种情况下（4）效果（4）效果（4），（4）在这种情况下（4）效果（4）范围（4）效果（4）证明我们的算法可以解决非平凡大小的单方面POSG，并在三个不同的领域中分析我们的算法的可扩展性：追捕，巡逻和搜索游戏。

Many security and other real-world situations are dynamic in nature and can be modelled as strictly competitive (or zero-sum) dynamic games. In these domains, agents perform actions to affect the environment and receive observations -- possibly imperfect -- about the situation and the effects of the opponent's actions. Moreover, there is no limitation on the total number of actions an agent can perform -- that is, there is no fixed horizon. These settings can be modelled as partially observable stochastic games (POSGs). However, solving general POSGs is computationally intractable, so we focus on a broad subclass of POSGs called one-sided POSGs. In these games, only one agent has imperfect information while their opponent has full knowledge of the current situation. We provide a full picture for solving one-sided POSGs: we (1) give a theoretical analysis of one-sided POSGs and their value functions, (2) show that a variant of a value-iteration algorithm converges in this setting, (3) adapt the heuristic search value-iteration algorithm for solving one-sided POSGs, (4) describe how to use approximate value functions to derive strategies in the game, and (5) demonstrate that our algorithm can solve one-sided POSGs of non-trivial sizes and analyze the scalability of our algorithm in three different domains: pursuit-evasion, patrolling, and search games.

下载PDF全文

下载文献需遵守相关版权规定

论文标题