Paper Title
How do noise tails impact on deep ReLU networks?
Paper Authors
Paper Abstract
This paper investigates the stability of deep ReLU neural networks for nonparametric regression under the assumption that the noise has only a finite p-th moment. We unveil how the optimal rate of convergence depends on p, the degree of smoothness, and the intrinsic dimension in a class of nonparametric regression functions with hierarchical composition structure when both the adaptive Huber loss and deep ReLU neural networks are used. This optimal rate of convergence cannot be obtained by ordinary least squares, but can be achieved by the Huber loss with a properly chosen parameter that adapts to the sample size, smoothness, and moment parameters. A concentration inequality for the adaptive Huber ReLU neural network estimators with allowable optimization errors is also derived. To establish a matching lower bound within the class of neural network estimators using the Huber loss, we employ a strategy different from the traditional route: we construct a deep ReLU network estimator that has a better empirical loss than the true function, and the difference between these two functions furnishes a lower bound. This step is related to the Huberization bias, yet more critically to the approximability of deep ReLU networks. As a result, we also contribute some new results to the approximation theory of deep ReLU neural networks.
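For concreteness, the sketch below shows the standard Huber loss and a schematic "adaptive" choice of its robustification parameter. The function names, the constant c, and the exponent gamma are illustrative assumptions only; the paper's actual parameter is tuned jointly to the sample size n, the smoothness of the regression function, and the moment order p, in a specific way not reproduced here.

    import numpy as np

    def huber_loss(residuals, tau):
        """Standard Huber loss: quadratic for |r| <= tau, linear beyond.

        tau > 0 is the robustification parameter; tau -> infinity recovers
        the squared loss, while small tau behaves like the absolute loss.
        """
        r = np.abs(residuals)
        return np.where(r <= tau, 0.5 * r**2, tau * r - 0.5 * tau**2)

    def adaptive_tau(n, p, gamma=None, c=1.0):
        """Schematic adaptive parameter: tau grows with the sample size n.

        The exponent here is a placeholder, NOT the paper's formula: in the
        paper, the growth rate of tau is dictated jointly by n, the
        smoothness, and the moment order p of the noise.
        """
        if gamma is None:
            gamma = 1.0 / max(p, 1.0)  # hypothetical default for illustration
        return c * n ** gamma

The point of letting tau grow with n is the bias-robustness trade-off the abstract alludes to: a fixed tau incurs a non-vanishing Huberization bias, while tau chosen too large loses robustness to heavy-tailed noise with only p finite moments.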