Paper Title

Proximal Stochastic Recursive Momentum Methods for Nonconvex Composite Decentralized Optimization

Paper Authors

Gabriel Mancino-Ball, Shengnan Miao, Yangyang Xu, Jie Chen

Paper Abstract

Consider a network of $N$ decentralized computing agents collaboratively solving a nonconvex stochastic composite problem. In this work, we propose a single-loop algorithm, called DEEPSTORM, that achieves optimal sample complexity for this setting. Unlike double-loop algorithms that require a large batch size to compute the (stochastic) gradient once in a while, DEEPSTORM uses a small batch size, creating advantages in settings such as streaming data and online learning. This is the first method to achieve optimal sample complexity for decentralized nonconvex stochastic composite problems while requiring only an $\mathcal{O}(1)$ batch size. We conduct convergence analysis for DEEPSTORM with both constant and diminishing step sizes. Additionally, under proper initialization and a small enough desired solution error, we show that DEEPSTORM with a constant step size achieves a network-independent sample complexity, with an additional linear speed-up with respect to $N$ over centralized methods. All code is made available at https://github.com/gmancino/DEEPSTORM.
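To make the abstract concrete, below is a minimal single-agent sketch of the proximal stochastic-recursive-momentum (STORM-style) update that the paper builds on: an $\mathcal{O}(1)$-batch gradient estimator with recursive momentum, followed by a proximal step for the nonsmooth part. This is not the authors' DEEPSTORM algorithm, which additionally mixes iterates and gradient trackers across the agent network via a gossip matrix; the toy objective and all names and constants here (`stoch_grad`, `prox_l1`, `beta`, `eta`, `lam`) are illustrative assumptions.

```python
# Sketch: proximal STORM-style update for minimizing f(x) + lam * ||x||_1,
# where f is smooth and nonconvex. Single agent only; the decentralized
# consensus/gossip steps of DEEPSTORM are omitted for brevity.
import numpy as np

rng = np.random.default_rng(0)

# Toy smooth nonconvex component: f(x) = mean_i log(1 + (a_i^T x - b_i)^2)
A = rng.standard_normal((1000, 20))
b = rng.standard_normal(1000)

def stoch_grad(x, idx):
    """Minibatch gradient of f at x over sample indices idx."""
    r = A[idx] @ x - b[idx]
    return A[idx].T @ (2 * r / (1 + r ** 2)) / len(idx)

def prox_l1(v, t):
    """Prox of r(x) = lam * ||x||_1 with step t, i.e. soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_storm_sketch(T=2000, batch=1, eta=0.05, beta=0.1, lam=0.01):
    x = np.zeros(A.shape[1])
    idx = rng.integers(0, len(b), batch)
    d = stoch_grad(x, idx)                 # initialize gradient estimator
    for _ in range(T):
        x_new = prox_l1(x - eta * d, eta * lam)   # proximal step
        idx = rng.integers(0, len(b), batch)      # O(1) batch per iteration
        # Recursive momentum: evaluate the SAME minibatch at x_new and x,
        # d <- grad(x_new) + (1 - beta) * (d - grad(x))
        d = stoch_grad(x_new, idx) + (1 - beta) * (d - stoch_grad(x, idx))
        x = x_new
    return x

x_hat = prox_storm_sketch()
```

The recursive correction term $(1-\beta)\big(d_{t-1} - \nabla f(x_{t-1};\xi_t)\big)$ is what lets the estimator's variance shrink without ever drawing a large batch, which is the property the abstract contrasts against double-loop variance-reduction methods.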
