Paper Title
Streaming Complexity of SVMs
Paper Authors
Paper Abstract
We study the space complexity of solving the bias-regularized SVM problem in the streaming model. This is a classic supervised learning problem that has drawn much attention, including the development of fast algorithms for solving it approximately. One of the most widely used algorithms for approximately optimizing the SVM objective is Stochastic Gradient Descent (SGD), which requires only $O(\frac{1}{\lambda\epsilon})$ random samples and immediately yields a streaming algorithm that uses $O(\frac{d}{\lambda\epsilon})$ space. For related problems, better streaming algorithms are known only for smooth functions, unlike the SVM objective that we focus on in this work. We initiate an investigation of the space complexity both of finding an approximate optimum of this objective and of the related ``point estimation'' problem of sketching the data set so as to evaluate the function value $F_\lambda$ on any query $(\theta, b)$. We show that, for both problems, for dimensions $d = 1, 2$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$, which is the complexity of SGD for strongly convex functions like the bias-regularized SVM, and which is known to be tight in general, even for $d = 1$. We also prove polynomial lower bounds for both point estimation and optimization. In particular, for point estimation we obtain a tight bound of $\Theta(1/\sqrt{\epsilon})$ for $d = 1$ and a nearly tight lower bound of $\widetilde{\Omega}(d/\epsilon^2)$ for $d = \Omega(\log(1/\epsilon))$. Finally, for optimization, we prove an $\Omega(1/\sqrt{\epsilon})$ lower bound for $d = \Omega(\log(1/\epsilon))$, and show similar bounds when $d$ is constant.
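
For context, the abstract does not spell out $F_\lambda$; a standard formulation of the bias-regularized SVM objective over $n$ labeled examples $(x_i, y_i) \in \mathbb{R}^d \times \{-1, +1\}$, which the sketch below assumes, is

$$
F_\lambda(\theta, b) \;=\; \frac{\lambda}{2}\left(\|\theta\|_2^2 + b^2\right) \;+\; \frac{1}{n}\sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i(\langle \theta, x_i \rangle + b)\bigr),
$$

where the bias term $b$ is regularized along with $\theta$ (hence ``bias-regularized''), making the objective $\lambda$-strongly convex.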
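To make the SGD baseline concrete, here is a minimal Pegasos-style sketch of streaming SGD for this objective, assuming the formulation above; the function and variable names are ours, and the paper itself is theoretical and fixes no implementation. One pass over the stream stores only the current iterate, i.e. $O(d)$ words.

```python
import numpy as np

def sgd_bias_regularized_svm(stream, d, lam, T):
    """Pegasos-style streaming SGD for the bias-regularized SVM objective
    (a sketch under the assumed formulation above, not the paper's algorithm).

    `stream` yields labeled examples (x, y) with x a length-d array and
    y in {-1, +1}; the only state kept between examples is (theta, b),
    i.e. O(d) space.
    """
    theta = np.zeros(d)
    b = 0.0
    for t, (x, y) in enumerate(stream, start=1):
        if t > T:
            break
        eta = 1.0 / (lam * t)  # standard step size for lam-strongly convex objectives
        # Subgradient step on (lam/2)(||theta||^2 + b^2) + hinge loss at (x, y)
        if y * (theta @ x + b) < 1:
            theta = (1.0 - eta * lam) * theta + eta * y * x
            b = (1.0 - eta * lam) * b + eta * y
        else:
            theta = (1.0 - eta * lam) * theta
            b = (1.0 - eta * lam) * b
    return theta, b

# Toy usage: T i.i.d. samples from a linearly separable distribution.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, lam, T = 2, 0.1, 10_000

    def sample_stream():
        while True:
            x = rng.standard_normal(d)
            y = 1.0 if x[0] + 0.5 > 0 else -1.0  # noiseless linear labels
            yield x, y

    theta, b = sgd_bias_regularized_svm(sample_stream(), d, lam, T)
    print("theta =", theta, "b =", b)
```

With step size $\eta_t = 1/(\lambda t)$, the standard analysis for $\lambda$-strongly convex objectives gives an $\epsilon$-approximate optimum after $O(\frac{1}{\lambda\epsilon})$ samples, matching the sample and space bounds quoted in the abstract; the paper's point is that for small $d$ this space bound can be beaten polynomially.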