TStarBot-X: An Open-Sourced and Comprehensive Study for Efficient League Training in StarCraft II Full Game

Paper Title

TStarBot-X: An Open-Sourced and Comprehensive Study for Efficient League Training in StarCraft II Full Game

Authors

Lei Han, Jiechao Xiong, Peng Sun, Xinghai Sun, Meng Fang, Qingwei Guo, Qiaobo Chen, Tengfei Shi, Hongsheng Yu, Xipeng Wu, Zhengyou Zhang

Abstract

StarCraft, one of the most difficult esports games, with a long-standing history of professional tournaments, has attracted generations of players and fans as well as intense attention in artificial intelligence research. Recently, Google's DeepMind announced AlphaStar, a grandmaster-level AI in StarCraft II that can play against humans using a comparable action space and operations. In this paper, we introduce a new AI agent, named TStarBot-X, that is trained with orders of magnitude less computation and can play competitively with expert human players. TStarBot-X takes advantage of important techniques introduced in AlphaStar and also benefits from substantial innovations, including new league training methods, novel multi-agent roles, rule-guided policy search, stabilized policy improvement, a lightweight neural network architecture, and importance sampling in imitation learning. We show that, at a computation scale orders of magnitude smaller, a faithful reimplementation of AlphaStar's methods cannot succeed, and that the proposed techniques are necessary to ensure TStarBot-X's competitive performance. We reveal all technical details that are complementary to those mentioned in AlphaStar, showing the most sensitive parts of league training, reinforcement learning, and imitation learning that affect the performance of the agents. Most importantly, this is an open-sourced study: all code and resources (including the trained model parameters) are publicly accessible at https://github.com/tencent-ailab/tleague_projpage. We expect this study to be beneficial for future academic and industrial research on solving complex problems like StarCraft, and it might also provide a sparring partner for all StarCraft II players and other AI agents.
