Paper Title
Variance Reduction on General Adaptive Stochastic Mirror Descent
Paper Authors
Paper Abstract
In this work, we investigate the idea of variance reduction by studying its properties with general adaptive mirror descent algorithms in nonsmooth nonconvex finite-sum optimization problems. We propose a simple yet generalized framework for variance reduced adaptive mirror descent algorithms named SVRAMD and provide its convergence analysis in both the nonsmooth nonconvex setting and the Polyak-Łojasiewicz (P-L) conditioned setting. We prove that variance reduction reduces the stochastic first-order oracle (SFO) complexity of adaptive mirror descent algorithms and thus accelerates their convergence. In particular, our general theory implies that variance reduction can be applied to algorithms using time-varying step sizes and self-adaptive algorithms such as AdaGrad and RMSProp. Moreover, the convergence rates of SVRAMD recover the best existing rates of non-adaptive variance reduced mirror descent algorithms without complicated algorithmic components. Extensive experiments in deep learning validate our theoretical findings.
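As a rough illustration of the kind of update the abstract describes, the sketch below combines an SVRG-style variance-reduced stochastic gradient with an RMSProp-like adaptive mirror descent step. It is a minimal sketch under stated assumptions, not the paper's exact SVRAMD algorithm: the names (`svramd_epoch`, `grad_fn`) and the specific hyperparameters are illustrative, and the adaptive proximal function is assumed to be a diagonal second-moment preconditioner.

```python
# Minimal sketch of an SVRG-style variance-reduced adaptive mirror descent epoch.
# Assumptions (not from the paper): grad_fn(x, i) returns the gradient of the
# i-th component function at x, and the adaptive geometry is an RMSProp-like
# diagonal preconditioner.
import numpy as np

def svramd_epoch(x, grad_fn, n, m=50, eta=0.1, beta=0.9, eps=1e-8):
    """One outer epoch: take a snapshot, compute the full gradient at it,
    then run m variance-reduced inner mirror descent steps."""
    x_tilde = x.copy()
    # Full gradient at the snapshot point (the SVRG control variate).
    mu = np.mean([grad_fn(x_tilde, i) for i in range(n)], axis=0)
    v = np.zeros_like(x)  # running second-moment estimate for the adaptive geometry
    for _ in range(m):
        i = np.random.randint(n)
        # SVRG-style variance-reduced stochastic gradient.
        g = grad_fn(x, i) - grad_fn(x_tilde, i) + mu
        # RMSProp-like diagonal preconditioner defining the Bregman geometry.
        v = beta * v + (1 - beta) * g ** 2
        # With proximal function psi(x) = 0.5 * x^T diag(sqrt(v) + eps) x,
        # the mirror descent step reduces to this preconditioned gradient step.
        x = x - eta * g / (np.sqrt(v) + eps)
    return x
```

Replacing the diagonal preconditioner with the identity recovers a plain SVRG-style mirror descent step in the Euclidean geometry, which is one way to read the abstract's claim that the framework covers both non-adaptive and self-adaptive proximal functions.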