Paper Title

A simplified convergence theory for Byzantine resilient stochastic gradient descent

Paper Authors

Lindon Roberts, Edward Smyth

Paper Abstract

In distributed learning, a central server trains a model according to updates provided by nodes holding local data samples. In the presence of one or more malicious servers sending incorrect information (a Byzantine adversary), standard algorithms for model training such as stochastic gradient descent (SGD) fail to converge. In this paper, we present a simplified convergence theory for the generic Byzantine Resilient SGD method originally proposed by Blanchard et al. [NeurIPS 2017]. Compared to the existing analysis, we show convergence to a stationary point in expectation under standard assumptions on the (possibly nonconvex) objective function and flexible assumptions on the stochastic gradients.
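
To give a concrete feel for the setting, the following is a minimal Python sketch of a generic Byzantine-resilient SGD loop: the server collects one stochastic gradient per node and applies a robust aggregation rule instead of the plain average. This is an illustration only, not the paper's algorithm or analysis; coordinate-wise median is used as a stand-in aggregator (Blanchard et al. propose Krum), and the function names, step sizes, and toy quadratic objective are all assumptions made for the example.

```python
import numpy as np

def robust_aggregate(gradients):
    """Combine per-node gradients with a coordinate-wise median
    (a simple stand-in for a Byzantine-resilient aggregation rule)."""
    return np.median(np.stack(gradients), axis=0)

def byzantine_resilient_sgd(grad_oracles, x0, step_size=0.01, n_iters=100):
    """Run SGD where each oracle returns a (possibly corrupted) stochastic gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        grads = [g(x) for g in grad_oracles]   # one stochastic gradient per node
        x = x - step_size * robust_aggregate(grads)
    return x

# Toy example: minimize f(x) = ||x||^2 / 2 (true gradient is x) with
# four honest nodes returning noisy gradients and one Byzantine node
# returning arbitrary noise.
honest = [lambda x: x + 0.1 * np.random.randn(*x.shape) for _ in range(4)]
byzantine = [lambda x: 10.0 * np.random.randn(*x.shape)]
x_final = byzantine_resilient_sgd(honest + byzantine, x0=np.ones(5), n_iters=500)
print(x_final)  # close to the minimizer at the origin despite the corrupted node
```

With a plain average in place of the median, the single Byzantine node's noise would dominate the update and the iterates would not settle near the minimizer; the robust aggregator is what the paper's convergence theory is about.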
