论文标题
从山沟方法到Nesterov方法,反之亦然:动态系统的观点
From the Ravine method to the Nesterov method and vice versa: a dynamical system perspective
论文作者
论文摘要
我们从动力学系统的角度重新审视了Gelfand和Tsetlin的沟渠方法,研究其收敛属性,并强调了其与Nesterov加速梯度方法的相似性和差异。这两种方法密切相关。可以通过在定义中逆转外推和梯度操作的顺序来互相推导。它们受益于对一般凸目标函数的值的类似的快速收敛和迭代的迭代收敛性。我们还将建立Ravine和Nesterov方法的高分辨率颂歌,并揭示由Hessian驱动的两种方法的其他几何阻尼项。这将使我们能够证明不仅对于沟渠方法,而且是Nesterov方法的零梯度的快速收敛。我们还突出了与更微妙的离散方案相关的其他算法的连接,并最终描述了用于一般结构平滑 +非平滑凸优化问题的近端梯度算法的山沟版本。
We revisit the Ravine method of Gelfand and Tsetlin from a dynamical system perspective, study its convergence properties, and highlight its similarities and differences with the Nesterov accelerated gradient method. The two methods are closely related. They can be deduced from each other by reversing the order of the extrapolation and gradient operations in their definitions. They benefit from similar fast convergence of values and convergence of iterates for general convex objective functions. We will also establish the high resolution ODE of the Ravine and Nesterov methods, and reveal an additional geometric damping term driven by the Hessian for both methods. This will allow us to prove fast convergence towards zero of the gradients not only for the Ravine method but also for the Nesterov method for the first time. We also highlight connections to other algorithms stemming from more subtle discretization schemes, and finally describe a Ravine version of the proximal-gradient algorithms for general structured smooth + non-smooth convex optimization problems.