Network-Density-Controlled Decentralized Parallel Stochastic Gradient Descent in Wireless Systems

Paper Title

Network-Density-Controlled Decentralized Parallel Stochastic Gradient Descent in Wireless Systems

Authors

Koya Sato, Yasuyuki Satoh, Daisuke Sugimura

Abstract

This paper proposes a communication strategy for decentralized learning on wireless systems. Our discussion is based on decentralized parallel stochastic gradient descent (D-PSGD), one of the state-of-the-art algorithms for decentralized learning. The main contribution of this paper is to raise a novel open question for decentralized learning on wireless systems: the density of the network topology may significantly influence the runtime performance of D-PSGD. In general, path loss and multipath fading make it difficult to guarantee delay-free communication without any deterioration in real wireless network systems. These factors significantly degrade the runtime performance of D-PSGD. To alleviate such problems, we first analyze the runtime performance of D-PSGD under realistic wireless conditions. This analysis yields two key insights: a dense network topology (1) does not significantly improve the training accuracy of D-PSGD compared with a sparse one, and (2) strongly degrades the runtime performance, because a dense topology generally requires low-rate transmission. Based on these findings, we propose a novel communication strategy in which each node estimates the optimal transmission rate such that the communication time during D-PSGD optimization is minimized under a constraint on network density, which is characterized by the radio propagation properties. Numerical simulations show that the proposed strategy improves the runtime performance of D-PSGD in wireless systems.
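The trade-off the abstract describes (a denser topology forces a lower transmission rate) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a log-distance path loss model, a Shannon-capacity link, and a density constraint expressed as a minimum neighbor count per node. The function names, the 40 dB reference loss at 1 m, and all parameter names are hypothetical choices made for illustration only.

```python
import math


def max_range_m(rate_bps, bandwidth_hz, tx_power_dbm, noise_dbm,
                path_loss_exp, ref_loss_db=40.0):
    """Largest distance (m) at which rate_bps is decodable.

    Assumed model (not from the paper): Shannon capacity plus
    log-distance path loss with ref_loss_db of loss at 1 m.
    """
    # Invert rate = B * log2(1 + SNR) to get the required SNR in dB.
    snr_req_db = 10 * math.log10(2 ** (rate_bps / bandwidth_hz) - 1)
    # Link budget: the path loss that can be tolerated at this rate.
    max_loss_db = tx_power_dbm - noise_dbm - snr_req_db
    # Invert loss(d) = ref_loss_db + 10 * n * log10(d).
    return 10 ** ((max_loss_db - ref_loss_db) / (10 * path_loss_exp))


def neighbor_counts(positions, radius_m):
    """Number of other nodes within radius_m of each node."""
    return [
        sum(1 for j, (xj, yj) in enumerate(positions)
            if j != i and math.hypot(xi - xj, yi - yj) <= radius_m)
        for i, (xi, yi) in enumerate(positions)
    ]


def pick_rate(positions, candidate_rates, min_neighbors, **link):
    """Highest candidate rate that keeps every node at min_neighbors or more.

    Returns (rate_bps, range_m), or None if no candidate is feasible.
    """
    for rate in sorted(candidate_rates, reverse=True):
        radius = max_range_m(rate, **link)
        if min(neighbor_counts(positions, radius)) >= min_neighbors:
            return rate, radius
    return None
```

For example, with four nodes spaced 50 m apart on a line, a 20 MHz channel, 20 dBm transmit power, a -90 dBm noise floor, and a path loss exponent of 3, requiring one neighbor per node admits a higher rate than requiring two, because the higher rate shortens the usable range and so thins the topology.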
