Paper Title
Scalable Planning and Learning Framework Development for Swarm-to-Swarm Engagement Problems
Paper Authors
Paper Abstract
Development of guidance, navigation, and control frameworks and algorithms for swarms has attracted significant attention in recent years. That being said, planning swarm allocations and trajectories for engagement with enemy swarms remains a largely understudied problem. Although small-scale scenarios can be addressed with tools from differential game theory, existing approaches fail to scale to large-scale multi-agent pursuit-evasion (PE) scenarios. In this work, we propose a reinforcement learning (RL) based framework that decomposes large-scale swarm-to-swarm engagement problems into a number of independent multi-agent pursuit-evasion games. We simulate a variety of multi-agent PE scenarios in which finite-time capture is guaranteed under certain conditions. The computed PE statistics are provided as a reward signal to the high-level allocation layer, which uses an RL algorithm to allocate controlled swarm units so as to eliminate enemy swarm units with maximum efficiency. We verify our approach in large-scale swarm-to-swarm engagement simulations.
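The hierarchical structure described above can be illustrated with a minimal sketch. Everything below is hypothetical and not taken from the paper: `simulate_pe_game` is a toy stand-in for a low-level pursuit-evasion simulation (a simple capture probability driven by the pursuer-to-evader ratio), and `allocation_reward` shows how per-game capture statistics could be aggregated into a reward signal for a high-level allocation layer.

```python
import random


def simulate_pe_game(num_pursuers, num_evaders, rng):
    """Toy stand-in for one independent multi-agent pursuit-evasion game.

    Returns the fraction of evaders captured. In this sketch, capture
    becomes certain once pursuers sufficiently outnumber evaders,
    loosely mirroring the idea that finite-time capture is guaranteed
    only under certain conditions.
    """
    if num_evaders == 0:
        return 1.0  # nothing left to capture
    p_capture = min(1.0, num_pursuers / (2.0 * num_evaders))
    captures = sum(rng.random() < p_capture for _ in range(num_evaders))
    return captures / num_evaders


def allocation_reward(allocation, enemy_units, rng):
    """High-level reward: average capture statistics over the independent
    PE games induced by one allocation of pursuer units to enemy units."""
    return sum(
        simulate_pe_game(pursuers, evaders, rng)
        for pursuers, evaders in zip(allocation, enemy_units)
    ) / len(enemy_units)


rng = random.Random(0)
enemy = [3, 2, 4]        # sizes of three enemy swarm units
balanced = [6, 4, 8]     # spread 18 pursuers proportionally
lopsided = [18, 0, 0]    # send all pursuers against one enemy unit

r_balanced = allocation_reward(balanced, enemy, rng)
r_lopsided = allocation_reward(lopsided, enemy, rng)
```

An RL algorithm at the allocation layer would treat `allocation` as its action and `allocation_reward` as the environment feedback, learning to favor allocations such as `balanced` (which captures everything in this toy model) over `lopsided` ones that leave some enemy units unengaged.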