AnyMorph: Learning Transferable Policies by Inferring Agent Morphology

Paper Title

AnyMorph: Learning Transferable Policies By Inferring Agent Morphology

Authors

Brandon Trabucco, Mariano Phielipp, Glen Berseth

Abstract

The prototypical approach to reinforcement learning involves training policies tailored to a particular agent from scratch for every new morphology. Recent work aims to eliminate the re-training of policies by investigating whether a morphology-agnostic policy, trained on a diverse set of agents with similar task objectives, can be transferred to new agents with unseen morphologies without re-training. This is a challenging problem that required previous approaches to use hand-designed descriptions of the new agent's morphology. Instead of hand-designing this description, we propose a data-driven method that learns a representation of morphology directly from the reinforcement learning objective. Ours is the first reinforcement learning algorithm that can train a policy to generalize to new agent morphologies without requiring a description of the agent's morphology in advance. We evaluate our approach on the standard benchmark for agent-agnostic control, and improve over the current state of the art in zero-shot generalization to new agents. Importantly, our method attains good performance without an explicit description of morphology.
