Paper Title
Boost Decentralized Federated Learning in Vehicular Networks by Diversifying Data Sources
Paper Authors
Paper Abstract
Recently, federated learning (FL) has been intensively studied because of its ability to preserve data privacy while scattered clients collaboratively train machine learning models. Commonly, a parameter server (PS) is deployed to aggregate the model parameters contributed by different clients. Decentralized federated learning (DFL) extends FL by allowing clients to aggregate model parameters directly with their neighbours. DFL is particularly feasible for vehicular networks, since vehicles communicate with each other in a vehicle-to-vehicle (V2V) manner. However, due to the restrictions of vehicle routes and communication distances, it is hard for an individual vehicle to sufficiently exchange models with others. The data sources contributing to the model on an individual vehicle may not be diversified enough, resulting in poor model accuracy. To address this problem, we propose the DFL-DDS (DFL with diversified Data Sources) algorithm to diversify data sources in DFL. Specifically, each vehicle maintains a state vector to record the contribution weight of each data source to its model. The Kullback-Leibler (KL) divergence is adopted to measure the diversity of a state vector. To boost the convergence of DFL, a vehicle tunes the aggregation weight of each data source by minimizing the KL divergence of its state vector, and the effectiveness of this tuning in diversifying data sources can be theoretically proved. Finally, the superiority of DFL-DDS is evaluated by extensive experiments (with the MNIST and CIFAR-10 datasets), which demonstrate that DFL-DDS can accelerate the convergence of DFL and significantly improve model accuracy compared with state-of-the-art baselines.
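To illustrate the diversity measure described in the abstract, the following is a minimal sketch of computing the KL divergence of a vehicle's state vector. The paper itself does not specify the reference distribution here; this sketch assumes the divergence is taken against the uniform distribution over data sources, so that a smaller value indicates more evenly diversified contributions. The function name and the example vectors are illustrative, not taken from the paper.

```python
import numpy as np

def kl_to_uniform(state_vector):
    """KL divergence between a normalized state vector and the uniform
    distribution over data sources (assumed reference distribution).
    Smaller values indicate more diversified contribution weights."""
    p = np.asarray(state_vector, dtype=float)
    p = p / p.sum()                        # normalize contribution weights
    u = np.full_like(p, 1.0 / p.size)      # uniform reference distribution
    mask = p > 0                           # treat 0 * log(0) as 0
    return float(np.sum(p[mask] * np.log(p[mask] / u[mask])))

# A skewed state vector (most contributions from one data source)
# yields a larger divergence than a balanced one, so minimizing the
# divergence pushes aggregation weights toward diversified sources.
skewed = kl_to_uniform([0.90, 0.05, 0.05])
balanced = kl_to_uniform([0.34, 0.33, 0.33])
```

Here `skewed > balanced`, matching the intuition that a vehicle whose model is dominated by one data source should down-weight that source during aggregation.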