Paper Title
Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning
Paper Authors
Paper Abstract
Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server. Most existing FL algorithms require models of identical architecture to be deployed across the clients and server, making it infeasible to train large models due to clients' limited system resources. In this work, we propose a novel ensemble knowledge transfer method named Fed-ET in which small models (different in architecture) are trained on clients, and used to train a larger model at the server. Unlike in conventional ensemble learning, in FL the ensemble can be trained on clients' highly heterogeneous data. Cognizant of this property, Fed-ET uses a weighted consensus distillation scheme with diversity regularization that efficiently extracts reliable consensus from the ensemble while improving generalization by exploiting the diversity within the ensemble. We show the generalization bound for the ensemble of weighted models trained on heterogeneous datasets that supports the intuition of Fed-ET. Our experiments on image and language tasks show that Fed-ET significantly outperforms other state-of-the-art FL algorithms with fewer communicated parameters, and is also robust against high data-heterogeneity.
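The abstract describes the server-side step as weighted consensus distillation from the heterogeneous client ensemble, with a diversity regularizer. Below is a minimal PyTorch-style sketch of that idea. The confidence-based weighting and the particular diversity term are illustrative assumptions rather than the paper's exact formulation, and all function and variable names (e.g. `ensemble_distillation_step`, `div_coeff`) are hypothetical.

```python
# Hypothetical sketch of weighted consensus distillation with a diversity
# regularizer, in the spirit of the Fed-ET description above. The weighting
# rule and the diversity term are illustrative simplifications, not the
# paper's exact method.
import torch
import torch.nn.functional as F

def ensemble_distillation_step(server_model, client_models, unlabeled_x,
                               optimizer, div_coeff=0.1):
    """One server-side distillation step on a batch of unlabeled transfer data."""
    with torch.no_grad():
        # Logits from each (frozen) client model: [num_clients, batch, classes]
        client_logits = torch.stack([m(unlabeled_x) for m in client_models])
        client_probs = F.softmax(client_logits, dim=-1)

        # Per-sample weights: each model's confidence (max probability),
        # normalized across models -- a stand-in for the weighted-consensus scheme.
        confidence = client_probs.max(dim=-1).values           # [num_clients, batch]
        weights = confidence / confidence.sum(dim=0, keepdim=True)

        # Weighted consensus prediction over the ensemble.
        consensus = (weights.unsqueeze(-1) * client_probs).sum(dim=0)  # [batch, classes]

        # A simple diversity signal: the model deviating most from the
        # consensus on each sample, used only as a regularization target.
        deviation = (client_probs - consensus.unsqueeze(0)).abs().sum(dim=-1)
        outlier_idx = deviation.argmax(dim=0)                   # [batch]
        batch_idx = torch.arange(unlabeled_x.size(0))
        outlier_probs = client_probs[outlier_idx, batch_idx]    # [batch, classes]

    optimizer.zero_grad()
    server_log_probs = F.log_softmax(server_model(unlabeled_x), dim=-1)

    # Consensus-distillation loss plus a small diversity term that keeps the
    # larger server model from fitting the consensus alone.
    consensus_loss = F.kl_div(server_log_probs, consensus, reduction="batchmean")
    diversity_loss = F.kl_div(server_log_probs, outlier_probs, reduction="batchmean")
    loss = consensus_loss + div_coeff * diversity_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

In a full FL round, such a step would follow client-side training of the small heterogeneous models and precede broadcasting updated (smaller) models back to clients, which is how the abstract frames the knowledge transfer from clients to the larger server model.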