Paper Title

FedAvg with Fine Tuning: Local Updates Lead to Representation Learning

Paper Authors

Collins, Liam; Hassani, Hamed; Mokhtari, Aryan; Shakkottai, Sanjay

Abstract

The Federated Averaging (FedAvg) algorithm, which alternates between a few local stochastic gradient updates at client nodes and a model averaging update at the server, is perhaps the most commonly used method in Federated Learning. Notwithstanding its simplicity, several empirical studies have shown that the output model of FedAvg, after a few fine-tuning steps, generalizes well to new unseen tasks. This surprising performance of such a simple method, however, is not fully understood from a theoretical point of view. In this paper, we formally investigate this phenomenon in the multi-task linear representation setting. We show that the generalizability of FedAvg's output stems from its power in learning the common data representation among the clients' tasks, by leveraging the diversity among client data distributions via local updates. We formally establish the iteration complexity required by the clients to prove such a result in the setting where the underlying shared representation is a linear map. To the best of our knowledge, this is the first such result for any setting. We also provide empirical evidence demonstrating FedAvg's representation learning ability in federated image classification with heterogeneous data.
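The abstract describes FedAvg's basic loop: each round, clients run a few local gradient steps from the broadcast global model, then the server averages the resulting local models. The following is a minimal sketch of that loop on heterogeneous linear regression clients; it illustrates only the FedAvg mechanics (the paper's full multi-task representation setting, with a shared low-rank map and per-client heads, is omitted). All names, dimensions, and hyperparameters here are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clients, d, n_samples = 8, 10, 50

# Heterogeneous clients: each has data (X_i, y_i) from its own ground truth.
clients = []
for _ in range(n_clients):
    w_true = rng.normal(size=d)
    X = rng.normal(size=(n_samples, d))
    clients.append((X, X @ w_true))

def local_update(w, X, y, lr=0.01, local_steps=5):
    """A few local gradient steps on the client's squared loss."""
    for _ in range(local_steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def avg_loss(w):
    """Mean squared error averaged over all clients."""
    return np.mean([np.mean((X @ w - y) ** 2) for X, y in clients])

# FedAvg: broadcast the global model, run local steps, average the results.
w_global = np.zeros(d)
for _ in range(100):  # communication rounds
    local_models = [local_update(w_global.copy(), X, y) for X, y in clients]
    w_global = np.mean(local_models, axis=0)
```

After training, `w_global` approximately minimizes the average of the clients' losses; the paper's analysis concerns how the *local* steps (as opposed to one-step averaging) shape what such a model learns across heterogeneous tasks.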
