Nonsmooth Nonconvex-Nonconcave Minimax Optimization: Primal-Dual Balancing and Iteration Complexity Analysis

Paper title

Nonsmooth Nonconvex-Nonconcave Minimax Optimization: Primal-Dual Balancing and Iteration Complexity Analysis

Authors

Li, Jiajin; Zhu, Linglingzhi; So, Anthony Man-Cho

Abstract

Nonconvex-nonconcave minimax optimization has gained widespread interest over the last decade. However, most existing works focus on variants of gradient descent-ascent (GDA) algorithms, which are only applicable to smooth nonconvex-concave settings. To address this limitation, we propose a novel algorithm named smoothed proximal linear descent-ascent (smoothed PLDA), which can effectively handle a broad range of structured nonsmooth nonconvex-nonconcave minimax problems. Specifically, we consider the setting where the primal function has a nonsmooth composite structure and the dual function possesses the Kurdyka-Lojasiewicz (KL) property with exponent $\theta \in [0,1)$. We introduce a novel convergence analysis framework for smoothed PLDA, the key components of which are our newly developed nonsmooth primal error bound and dual error bound. Using this framework, we show that smoothed PLDA can find both $\epsilon$-game-stationary points and $\epsilon$-optimization-stationary points of the problems of interest in $\mathcal{O}(\epsilon^{-2\max\{2\theta,1\}})$ iterations. Furthermore, when $\theta \in [0,\frac{1}{2}]$, smoothed PLDA achieves the optimal iteration complexity of $\mathcal{O}(\epsilon^{-2})$. To further demonstrate the effectiveness and wide applicability of our analysis framework, we show that certain max-structured problems possess the KL property with exponent $\theta = 0$ under mild assumptions. As a by-product, we establish algorithm-independent quantitative relationships among various stationarity concepts, which may be of independent interest.
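
As a quick reading of the bound stated in the abstract (a worked case split on the exponent only, not a step taken from the paper's proofs), the exponent $2\max\{2\theta,1\}$ simplifies according to which term attains the maximum:

$$
\mathcal{O}\big(\epsilon^{-2\max\{2\theta,1\}}\big) =
\begin{cases}
\mathcal{O}(\epsilon^{-2}), & \theta \in [0,\tfrac{1}{2}],\\
\mathcal{O}(\epsilon^{-4\theta}), & \theta \in (\tfrac{1}{2},1).
\end{cases}
$$

The bound therefore stays at $\epsilon^{-2}$ up to $\theta = \tfrac{1}{2}$ and degrades continuously beyond it, approaching but never reaching $\epsilon^{-4}$ as $\theta \to 1$. For instance, $\theta = \tfrac{3}{4}$ gives $\mathcal{O}(\epsilon^{-3})$.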
