Experience Selection Using Dynamics Similarity for Efficient Multi-Source Transfer Learning Between Robots

Paper Title

Experience Selection Using Dynamics Similarity for Efficient Multi-Source Transfer Learning Between Robots

Authors

Sorocky, Michael J.; Zhou, Siqi; Schoellig, Angela P.

Abstract

In the robotics literature, different knowledge transfer approaches have been proposed to leverage the experience from a source task or robot -- real or virtual -- to accelerate the learning process on a new task or robot. A commonly made but infrequently examined assumption is that incorporating experience from a source task or robot will be beneficial. In practice, inappropriate knowledge transfer can result in negative transfer or unsafe behaviour. In this work, inspired by a system gap metric from robust control theory, the $ν$-gap, we present a data-efficient algorithm for estimating the similarity between pairs of robot systems. In a multi-source inter-robot transfer learning setup, we show that this similarity metric allows us to predict relative transfer performance and thus informatively select experiences from a source robot before knowledge transfer. We demonstrate our approach with quadrotor experiments, where we transfer an inverse dynamics model from a real or virtual source quadrotor to enhance the tracking performance of a target quadrotor on arbitrary hand-drawn trajectories. We show that selecting experiences based on the proposed similarity metric effectively facilitates the learning of the target quadrotor, improving performance by 62% compared to a poorly selected experience.
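To make the idea of selecting a source by dynamics similarity concrete, the following is a minimal sketch and not the data-efficient estimation algorithm from the paper. It assumes each robot is approximated by a hypothetical stable first-order SISO model, that the winding-number condition of the $ν$-gap holds, and that the supremum over frequency can be approximated on a finite grid. Under those assumptions the $ν$-gap reduces to the largest pointwise chordal distance between the two frequency responses. All function names, model parameters, and source labels here are illustrative assumptions.

```python
import numpy as np


def first_order_response(gain, time_constant, omega):
    """Frequency response of a hypothetical model G(s) = gain / (time_constant * s + 1)."""
    return gain / (1j * omega * time_constant + 1.0)


def chordal_distance(p1, p2):
    """Pointwise chordal distance between two complex frequency responses."""
    return np.abs(p1 - p2) / (
        np.sqrt(1.0 + np.abs(p1) ** 2) * np.sqrt(1.0 + np.abs(p2) ** 2)
    )


def nu_gap_estimate(p1, p2):
    """Grid approximation of the nu-gap, ASSUMING the winding-number condition holds."""
    return float(np.max(chordal_distance(p1, p2)))


def select_most_similar_source(target, sources):
    """Return the name of the source with the smallest estimated gap, plus all gaps."""
    gaps = {name: nu_gap_estimate(target, resp) for name, resp in sources.items()}
    best = min(gaps, key=gaps.get)
    return best, gaps


# Illustrative (made-up) models: one target and two candidate sources.
omega = np.logspace(-2, 2, 400)  # frequency grid in rad/s
target = first_order_response(1.0, 0.5, omega)
sources = {
    "source_a": first_order_response(1.1, 0.55, omega),  # close to the target
    "source_b": first_order_response(3.0, 2.0, omega),   # far from the target
}

best, gaps = select_most_similar_source(target, sources)
print(best, gaps)
```

The estimate lies in [0, 1], with smaller values indicating more similar dynamics, so the source with the smallest value would be the one chosen for transfer in this simplified setting.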
