Paper Title
Global Convergence of Frank Wolfe on One Hidden Layer Networks
Paper Authors
Paper Abstract
We derive global convergence bounds for the Frank Wolfe algorithm when training one hidden layer neural networks. When using the ReLU activation function, and under tractable preconditioning assumptions on the sample data set, the linear minimization oracle used to incrementally form the solution can be solved explicitly as a second-order cone program. The classical Frank Wolfe algorithm then converges at rate $O(1/T)$, where $T$ is both the number of neurons and the number of calls to the oracle.
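To make the rate statement concrete, here is a minimal sketch of the classical Frank Wolfe loop with the open-loop step size $\gamma_t = 2/(t+2)$ that yields the $O(1/T)$ bound. The oracle is left abstract: in the paper it is solved as a second-order cone program (with each call contributing one neuron), whereas the `l1_lmo` toy oracle and the quadratic example below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def frank_wolfe(grad_f, lmo, x0, T):
    """Classical Frank-Wolfe with the standard 2/(t+2) step size.

    After T iterations the primal gap is O(1/T); each iteration makes
    exactly one call to the linear minimization oracle `lmo`.
    """
    x = np.asarray(x0, dtype=float)
    for t in range(T):
        g = grad_f(x)                      # gradient at the current iterate
        s = lmo(g)                         # atom minimizing <g, s> over the feasible set
        gamma = 2.0 / (t + 2.0)            # open-loop step size giving the O(1/T) rate
        x = (1.0 - gamma) * x + gamma * s  # convex-combination update toward the atom
    return x

def l1_lmo(g, radius=1.0):
    """Toy LMO for the l1 ball: a signed basis vector at the largest |g_i|."""
    i = int(np.argmax(np.abs(g)))
    s = np.zeros_like(g)
    s[i] = -radius * np.sign(g[i])
    return s

# Toy usage: minimize ||x - b||^2 over the unit l1 ball (optimum is x = b).
b = np.array([0.3, -0.2, 0.5])
x_hat = frank_wolfe(lambda x: 2.0 * (x - b), l1_lmo, np.zeros(3), T=500)
print(np.linalg.norm(x_hat - b))  # small residual, shrinking like O(1/T)
```

Note how the iterate is always a convex combination of at most $T$ atoms returned by the oracle; this is the sense in which the solution is formed incrementally, one oracle call (one neuron, in the paper's setting) per iteration.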