Paper Title
Stochastic Approximate Gradient Descent via the Langevin Algorithm
Paper Authors
Paper Abstract
We introduce a novel and efficient algorithm called stochastic approximate gradient descent (SAGD), as an alternative to stochastic gradient descent for cases where unbiased stochastic gradients cannot be trivially obtained. Traditional approaches to such problems rely on general-purpose sampling techniques such as Markov chain Monte Carlo, which typically require manual parameter tuning and often work inefficiently in practice. Instead, SAGD uses the Langevin algorithm to construct stochastic gradients that are biased after finitely many steps but asymptotically accurate, which enables us to establish theoretical convergence guarantees for SAGD. Motivated by our theoretical analysis, we also provide useful guidelines for its practical implementation. Finally, we show experimentally that SAGD performs well on popular statistical and machine learning problems such as the expectation-maximization algorithm and variational autoencoders.
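The abstract contains no pseudocode, so the following is a minimal Python sketch of the kind of scheme it describes: at each iteration, a few steps of the unadjusted Langevin algorithm produce an approximate draw of the latent variable (biased for finitely many steps, accurate asymptotically), which is plugged into a Monte Carlo gradient estimate for the parameter update. All names (`langevin_sample`, `sagd`, `grad_log_post_fn`, `grad_theta_fn`) and the toy Gaussian latent-variable model are our own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def langevin_sample(z0, grad_log_post, n_steps=50, step=1e-2, rng=None):
    """Unadjusted Langevin algorithm: z <- z + step * grad log p(z) + Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    z = np.asarray(z0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z + step * grad_log_post(z) + np.sqrt(2.0 * step) * noise
    return z  # approximate posterior draw; biased for finite n_steps

def sagd(theta0, grad_log_post_fn, grad_theta_fn, z0,
         n_iters=500, lr=5e-2, langevin_steps=50, rng=None):
    """Hypothetical SAGD loop: alternate Langevin sampling of the latent z
    with a gradient step on theta. This ascends the estimated gradient of
    a marginal log-likelihood (equivalently, descends its negative)."""
    rng = np.random.default_rng(0) if rng is None else rng
    theta = float(theta0)
    z = np.asarray(z0, dtype=float)
    for _ in range(n_iters):
        # Refresh the latent sample at the current theta (approximate E-step).
        z = langevin_sample(z, lambda zz: grad_log_post_fn(zz, theta),
                            n_steps=langevin_steps, rng=rng)
        # Plug the sample into the Monte Carlo gradient estimate (approximate M-step).
        theta = theta + lr * grad_theta_fn(theta, z)
    return theta

# Toy latent-variable model (our assumption): z ~ N(theta, 1), x | z ~ N(z, 1).
# Then d/dz log p(x, z; theta) = (theta - z) + (x - z)
# and  d/dtheta log p(x, z; theta) = z - theta.
x = 2.0
theta_hat = sagd(
    theta0=0.0,
    grad_log_post_fn=lambda z, th: (th - z) + (x - z),
    grad_theta_fn=lambda th, z: z - th,
    z0=0.0,
)
print(theta_hat)  # should approach the MLE, which is x = 2.0 in this toy model
```

In this sketch the Langevin chain is short, so each gradient estimate is biased, but its fixed point is still correct in expectation: the posterior mean of z is (theta + x)/2, so the update drives theta toward x, mirroring the abstract's claim of finite-step bias with asymptotic accuracy.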