Paper Title
An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification
Paper Authors
Paper Abstract
Sparsity regularized loss minimization problems play an important role in many fields, including machine learning, data mining, and modern statistics. Proximal gradient descent and coordinate descent are the most popular approaches to solving such problems. Although existing methods achieve implicit model identification, also known as support set identification, in a finite number of iterations, they still incur large computational and memory costs in high-dimensional settings. The reason is that support set identification in these methods is implicit, so the low-complexity structure cannot be exploited explicitly in practice: the methods cannot discard the coefficients of irrelevant features and therefore cannot accelerate via dimension reduction. To address this challenge, we propose a novel accelerated doubly stochastic gradient descent (ADSGD) method for sparsity regularized loss minimization, which reduces the number of block iterations by eliminating inactive coefficients during optimization, thereby achieving faster explicit model identification and improved algorithmic efficiency. Theoretically, we first prove that ADSGD attains a linear convergence rate with lower overall computational complexity. More importantly, we prove that ADSGD achieves a linear rate of explicit model identification. Numerically, experiments on benchmark datasets confirm the efficiency of the proposed method.
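To make the abstract's description concrete, below is a minimal Python sketch of a doubly stochastic proximal step (random training example plus random coordinate block per update) combined with explicit elimination of inactive coefficients, applied to an l1-regularized least-squares objective min_w (1/2n)||Aw - b||^2 + lam*||w||_1. Everything in the sketch is an illustrative assumption: the objective, step size, block-sampling scheme, and especially the zero-streak screening heuristic are placeholders rather than the paper's ADSGD algorithm, and the acceleration (momentum) component is omitted for brevity.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def doubly_stochastic_prox_sketch(A, b, lam, n_epochs=50, block_size=10, seed=0):
    """Hypothetical sketch, NOT the authors' ADSGD: each step samples one
    example and one coordinate block, takes a stochastic proximal gradient
    step, and periodically drops coordinates the prox keeps at zero so that
    later epochs iterate over a shrinking set of blocks."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    step = n / (np.linalg.norm(A, ord=2) ** 2)  # crude 1/L step-size guess
    w = np.zeros(d)
    active = np.arange(d)        # explicitly maintained support estimate
    zero_streak = np.zeros(d)    # consecutive prox steps pinned at zero

    for _ in range(n_epochs):
        for _ in range(max(1, len(active) // block_size)):
            i = rng.integers(n)                                   # random example
            blk = rng.choice(active, size=min(block_size, len(active)),
                             replace=False)                       # random block
            r = A[i] @ w - b[i]
            g = r * A[i, blk]                                     # stochastic partial gradient
            w[blk] = soft_threshold(w[blk] - step * g, step * lam)
            zero_streak[blk] = np.where(w[blk] == 0.0, zero_streak[blk] + 1, 0)
        # placeholder screening rule: eliminate coordinates stuck at zero,
        # shrinking the pool of blocks sampled in subsequent epochs
        active = active[zero_streak[active] < 5]
        if len(active) == 0:
            break
    return w
```

The point of the sketch is the interaction the abstract highlights: once a coefficient is explicitly removed from `active`, no further block iterations are spent on it, which is where the claimed per-epoch speedup in high dimensions would come from.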