Paper title
3PC: Three Point Compressors for Communication-Efficient Distributed Training and a Better Theory for Lazy Aggregation
Paper authors
Paper abstract
We propose and study a new class of gradient communication mechanisms for communication-efficient training -- three point compressors (3PC) -- as well as efficient distributed nonconvex optimization algorithms that can take advantage of them. Unlike most established approaches, which rely on a static compressor choice (e.g., Top-$K$), our class allows the compressors to {\em evolve} throughout the training process, with the aim of improving the theoretical communication complexity and practical efficiency of the underlying methods. We show that our general approach can recover the recently proposed state-of-the-art error feedback mechanism EF21 (Richtárik et al., 2021) and its theoretical properties as a special case, but also leads to a number of new efficient methods. Notably, our approach allows us to improve upon the state of the art in the algorithmic and theoretical foundations of the {\em lazy aggregation} literature (Chen et al., 2018). As a by-product that may be of independent interest, we provide a new and fundamental link between the lazy aggregation and error feedback literature. A special feature of our work is that we do not require the compressors to be unbiased.
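To make the two mechanisms named in the abstract concrete, the following is a minimal sketch of a Top-$K$ compressor and of an EF21-style update of a worker's gradient estimate, written as g_new = g_prev + TopK(grad_new - g_prev). The function names `top_k` and `ef21_update` and the toy vectors are illustrative assumptions, not code from the paper, and the sketch covers only this one special case rather than the general 3PC class.

```python
def top_k(v, k):
    """Keep the k largest-magnitude entries of v and zero the rest.

    This is a biased compressor: its output is not an unbiased
    estimate of v.
    """
    keep = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k]
    out = [0.0] * len(v)
    for i in keep:
        out[i] = v[i]
    return out


def ef21_update(g_prev, grad_new, k):
    """EF21-style step (illustrative): compress the difference between
    the fresh gradient and the previous estimate, then add it back."""
    diff = [a - b for a, b in zip(grad_new, g_prev)]
    compressed = top_k(diff, k)  # only this sparse vector would be communicated
    return [g + c for g, c in zip(g_prev, compressed)]


# Toy run with a fixed gradient: the estimate gains one coordinate per round.
g = [0.0, 0.0, 0.0]
grad = [3.0, -1.0, 0.5]
history = []
for _ in range(3):
    g = ef21_update(g, grad, k=1)
    history.append(g)
```

In this toy run the gradient is held fixed, so the estimate `g` matches it exactly after three rounds even though each round transmits a single coordinate. The compressed message depends on the previous estimate as well as on the new gradient, which is one way to read the abstract's statement that the compressors evolve during training.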