Paper Title

Weighted Gradient Coding with Leverage Score Sampling

Authors

Charalambides, Neophytos, Pilanci, Mert, Hero III, Alfred O.

Abstract

A major hurdle in machine learning is scalability to massive datasets. Approaches to overcome this hurdle include compression of the data matrix and distribution of the computations. \textit{Leverage score sampling} provides a compressed approximation of a data matrix using an importance-weighted subset of its rows. \textit{Gradient coding} has recently been proposed in distributed optimization to compute the gradient using multiple unreliable worker nodes. By designing coding matrices, gradient-coded computations can be made resilient to stragglers, which are nodes in a distributed network that degrade system performance. We present a novel \textit{weighted leverage score} approach that improves the performance of distributed gradient coding by utilizing importance sampling.
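To make the leverage-score ingredient concrete, the following is a minimal NumPy sketch of classical leverage score sampling for an overdetermined least-squares problem. This illustrates the standard technique the abstract builds on, not the authors' weighted gradient-coding scheme; the function names and the choice of a QR-based computation are illustrative assumptions.

```python
import numpy as np

def leverage_scores(A):
    # The leverage score of row i is the i-th diagonal entry of the
    # projection matrix A (A^T A)^{-1} A^T. For full-column-rank A this
    # is computed stably from a thin QR factorization: l_i = ||Q_i||^2.
    Q, _ = np.linalg.qr(A)
    return np.sum(Q**2, axis=1)

def leverage_score_sample(A, b, s, rng=None):
    # Sample s rows with probability proportional to their leverage
    # scores, rescaling each sampled row by 1/sqrt(s * p_i) so the
    # sketched normal equations approximate the full ones in expectation.
    rng = np.random.default_rng(rng)
    l = leverage_scores(A)
    p = l / l.sum()
    idx = rng.choice(A.shape[0], size=s, replace=True, p=p)
    scale = 1.0 / np.sqrt(s * p[idx])
    return A[idx] * scale[:, None], b[idx] * scale
```

Solving least squares on the returned `(A_s, b_s)` pair then approximates the solution on the full data at a fraction of the cost; the leverage scores sum to the rank of `A`, so rows spanning low-leverage directions are rarely sampled.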
