Paper Title
Revisiting the central limit theorems for the SGD-type methods
Paper Authors
Paper Abstract
We revisit the central limit theorem (CLT) for stochastic gradient descent (SGD)-type methods, including vanilla SGD, momentum SGD, and Nesterov accelerated SGD with constant or vanishing damping parameters. By taking advantage of Lyapunov function techniques and $L^p$ bound estimates, we establish the CLT under more general conditions on the learning rates and for broader classes of SGD methods than in previous results. The CLT for the time average is also investigated; we find that it holds in the linear case but is generally not true in the nonlinear setting. Numerical tests are carried out to verify our theoretical analysis.
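For orientation, the display below recalls the standard textbook forms of the three update rules named in the abstract, together with the time (Polyak–Ruppert) average. This is only an illustrative sketch: the abstract does not specify the paper's exact parameterization of the learning rates $\eta_k$ or the damping/momentum parameters $\beta_k$, and $g(x;\xi_k)$ here denotes a generic stochastic gradient of the objective at $x$.

% Standard forms assumed for illustration; the paper's exact schedules for the
% learning rate \eta_k and the damping parameter \beta_k are not given in the abstract.
\begin{align*}
  &\text{Vanilla SGD:} && x_{k+1} = x_k - \eta_k\, g(x_k;\xi_k),\\
  &\text{Momentum SGD (heavy ball):} && x_{k+1} = x_k - \eta_k\, g(x_k;\xi_k) + \beta_k\,(x_k - x_{k-1}),\\
  &\text{Nesterov accelerated SGD:} && y_k = x_k + \beta_k\,(x_k - x_{k-1}),\qquad
     x_{k+1} = y_k - \eta_k\, g(y_k;\xi_k),\\
  &\text{Time average:} && \bar{x}_n = \frac{1}{n}\sum_{k=1}^{n} x_k .
\end{align*}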