Paper Title

Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization

Paper Authors

Chong You, Zhihui Zhu, Qing Qu, Yi Ma

Paper Abstract

Recent advances have shown that the implicit bias of gradient descent on over-parameterized models enables the recovery of low-rank matrices from linear measurements, even with no prior knowledge of the intrinsic rank. In contrast, for robust low-rank matrix recovery from grossly corrupted measurements, over-parameterization leads to overfitting without prior knowledge of both the intrinsic rank and the sparsity of the corruption. This paper shows that with a double over-parameterization for both the low-rank matrix and the sparse corruption, gradient descent with discrepant learning rates provably recovers the underlying matrix even without prior knowledge of either the rank of the matrix or the sparsity of the corruption. We further extend our approach to the robust recovery of natural images by over-parameterizing images with deep convolutional networks. Experiments show that our method handles different test images and varying corruption levels with a single learning pipeline, where the network width and termination conditions do not need to be adjusted on a case-by-case basis. Underlying the success is again the implicit bias with discrepant learning rates on different over-parameterized parameters, which may bear on broader applications.
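To make the mechanism concrete, below is a minimal NumPy sketch of double over-parameterization with discrepant learning rates, written for the special case where the measurement operator is the identity (robust PCA: Y = X* + S*). The factor width, initialization scale, learning rate `lr`, discrepancy ratio `alpha`, and iteration count are illustrative assumptions for this sketch, not values or code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 3

# Ground truth: rank-r symmetric matrix plus sparse symmetric corruption.
B = rng.standard_normal((n, r))
X_true = B @ B.T
S_raw = np.where(rng.random((n, n)) < 0.05, 5.0 * rng.standard_normal((n, n)), 0.0)
S_true = np.triu(S_raw) + np.triu(S_raw, 1).T   # symmetrize the sparse part
Y = X_true + S_true                             # corrupted observation

# Double over-parameterization with no rank or sparsity prior:
# X = U U^T - V V^T with full-width (n x n) factors, and the corruption
# written as a difference of elementwise squares, S = g*g - h*h.
U = 1e-3 * rng.standard_normal((n, n))
V = 1e-3 * rng.standard_normal((n, n))
g = 1e-3 * np.ones((n, n))
h = 1e-3 * np.ones((n, n))

lr = 1e-3     # learning rate for the matrix factors U, V
alpha = 2.0   # discrepancy ratio: g, h are updated with alpha * lr

# Gradient descent on f = (1/4) * ||U U^T - V V^T + g*g - h*h - Y||_F^2.
for _ in range(5000):
    R = U @ U.T - V @ V.T + g * g - h * h - Y   # residual
    Rs = 0.5 * (R + R.T)                        # symmetrized residual
    U, V = U - lr * Rs @ U, V + lr * Rs @ V                 # rate lr
    g, h = g - alpha * lr * R * g, h + alpha * lr * R * h   # rate alpha * lr

X_hat = U @ U.T - V @ V.T
S_hat = g * g - h * h
print("rel. error in X:", np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true))
print("rel. error in S:", np.linalg.norm(S_hat - S_true) / np.linalg.norm(S_true))
```

In this sketch the ratio `alpha` between the two learning rates is the knob the abstract refers to: together with the small initialization, it biases gradient descent toward a particular trade-off between the low-rank and sparse terms, playing a role analogous to the regularization weight in convex robust PCA, without ever specifying the rank or the corruption support explicitly.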
