Paper Title
Multi Type Mean Field Reinforcement Learning
Paper Authors
Paper Abstract
Mean field theory provides an effective way of scaling multiagent reinforcement learning algorithms to environments with many agents, which can be abstracted by a virtual mean agent. In this paper, we extend mean field multiagent algorithms to multiple types. The types enable the relaxation of a core assumption in mean field reinforcement learning, namely that all agents in the environment play almost similar strategies and have the same goal. We conduct experiments on three different testbeds for the field of many-agent reinforcement learning, based on the standard MAgent framework. We consider two different kinds of mean field environments: a) games where agents belong to predefined types that are known a priori, and b) games where the type of each agent is unknown and therefore must be learned based on observations. We introduce new algorithms for each kind of game and demonstrate their superior performance over state-of-the-art algorithms that assume all agents belong to the same type, as well as over other baseline algorithms in the MAgent framework.
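For intuition, the sketch below illustrates the core idea in a hypothetical tabular form: neighbours' one-hot actions are averaged separately per type, and the Q-function is conditioned on the agent's state, its own action, and one mean action per type. The names used here (type_mean_actions, td_update, q_table) and the tabular setting are illustrative assumptions, not the paper's implementation, which relies on neural function approximation.

```python
import numpy as np
from collections import defaultdict

# A minimal sketch of a multi-type mean field Q-update, assuming discrete
# states/actions and M agent types known a priori. All names are illustrative.

N_ACTIONS, N_TYPES = 3, 2
GAMMA, ALPHA = 0.95, 0.1

def type_mean_actions(neighbor_actions, neighbor_types):
    """One mean action (a distribution over actions) per neighbour type."""
    means = np.zeros((N_TYPES, N_ACTIONS))
    counts = np.zeros(N_TYPES)
    for a, t in zip(neighbor_actions, neighbor_types):
        means[t, a] += 1.0
        counts[t] += 1.0
    means[counts > 0] /= counts[counts > 0, None]
    return means

def key(state, action, mean_acts):
    # Discretise the per-type mean actions so they can index a table.
    return (state, action, tuple(np.round(mean_acts, 1).ravel()))

q_table = defaultdict(float)

def td_update(state, action, mean_acts, reward, next_state, next_mean_acts):
    """Standard Q-learning target, but Q is conditioned on per-type means."""
    best_next = max(q_table[key(next_state, a, next_mean_acts)]
                    for a in range(N_ACTIONS))
    target = reward + GAMMA * best_next
    q_table[key(state, action, mean_acts)] += ALPHA * (
        target - q_table[key(state, action, mean_acts)])

# Example: five neighbours split across the two types.
means = type_mean_actions([0, 2, 2, 1, 0], [0, 0, 1, 1, 1])
td_update(state=0, action=1, mean_acts=means, reward=1.0,
          next_state=1, next_mean_acts=means)
print(dict(q_table))
```

In the unknown-type setting described in the abstract, the type labels passed to type_mean_actions would themselves have to be inferred from observations rather than given, but the structure of the update stays the same.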