Paper Title

Finite-Time Analysis of Stochastic Gradient Descent under Markov Randomness

Authors

Doan, Thinh T., Nguyen, Lam M., Pham, Nhan H., Romberg, Justin

Abstract

Motivated by broad applications in reinforcement learning and machine learning, this paper considers the popular stochastic gradient descent (SGD) when the gradients of the underlying objective function are sampled from Markov processes. This Markov sampling leads to the gradient samples being biased and not independent. The existing results for the convergence of SGD under Markov randomness are often established under the assumptions on the boundedness of either the iterates or the gradient samples. Our main focus is to study the finite-time convergence of SGD for different types of objective functions, without requiring these assumptions. We show that SGD converges nearly at the same rate with Markovian gradient samples as with independent gradient samples. The only difference is a logarithmic factor that accounts for the mixing time of the Markov chain.
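To make the setting concrete, here is a minimal sketch (not the paper's algorithm) of SGD where the index of the sampled gradient evolves along an ergodic Markov chain instead of being drawn i.i.d. The quadratic objective, the transition matrix, and the diminishing step size `1/t` below are all illustrative assumptions; because the chain's stationary distribution is uniform over the samples, the iterates still approach the minimizer of the average objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative finite-sum objective f(x) = (1/n) * sum_i 0.5 * (x - a_i)^2,
# whose minimizer is the mean of the a_i (here 2.5).
a = np.array([1.0, 2.0, 3.0, 4.0])
n = len(a)

# Transition matrix of an ergodic Markov chain over the n sample indices:
# a 0.7 self-loop plus uniform 0.1 off-diagonal mass. It is doubly
# stochastic, so its stationary distribution is uniform and the Markovian
# gradient samples average out to the full gradient in the long run.
P = np.full((n, n), 0.1) + 0.6 * np.eye(n)

x, i = 0.0, 0
for t in range(1, 5001):
    g = x - a[i]                 # gradient sample at the current chain state
    x -= (1.0 / t) * g           # diminishing step size alpha_t = 1/t
    i = rng.choice(n, p=P[i])    # next index drawn from the Markov chain

print(x)  # expected to be near the minimizer 2.5
```

Note that consecutive gradient samples here are biased and correlated (the chain lingers on each index because of the self-loop), which is exactly the regime the paper analyzes; only the chain's mixing time, not independence of samples, controls the extra logarithmic factor in the rate.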
