Paper Title

Neural Network Architecture Beyond Width and Depth

Paper Authors

Zuowei Shen, Haizhao Yang, Shijun Zhang

Paper Abstract

This paper proposes a new neural network architecture by introducing an additional dimension called height beyond width and depth. Neural network architectures with height, width, and depth as hyper-parameters are called three-dimensional architectures. It is shown that neural networks with three-dimensional architectures are significantly more expressive than the ones with two-dimensional architectures (those with only width and depth as hyper-parameters), e.g., standard fully connected networks. The new network architecture is constructed recursively via a nested structure, and hence we call a network with the new architecture a nested network (NestNet). A NestNet of height $s$ is built with each hidden neuron activated by a NestNet of height $\le s-1$. When $s=1$, a NestNet degenerates to a standard network with a two-dimensional architecture. It is proved by construction that height-$s$ ReLU NestNets with $\mathcal{O}(n)$ parameters can approximate $1$-Lipschitz continuous functions on $[0,1]^d$ with an error $\mathcal{O}(n^{-(s+1)/d})$, while the optimal approximation error of standard ReLU networks with $\mathcal{O}(n)$ parameters is $\mathcal{O}(n^{-2/d})$. Furthermore, such a result is extended to generic continuous functions on $[0,1]^d$ with the approximation error characterized by the modulus of continuity. Finally, we use numerical experimentation to show the advantages of the super-approximation power of ReLU NestNets.
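
To make the recursive definition concrete, below is a minimal sketch of the NestNet idea in PyTorch. The layer sizes, the class name, and the choice of sharing a single scalar-to-scalar activation subnetwork across all hidden neurons are illustrative assumptions, not the authors' exact construction; the paper only specifies that each hidden neuron of a height-$s$ NestNet is activated by a NestNet of height $\le s-1$, and that height $1$ reduces to a standard ReLU network.

```python
# Hypothetical sketch of a height-s NestNet: each hidden neuron is activated
# by a NestNet of height <= s-1; height 1 degenerates to a plain ReLU network.
import torch
import torch.nn as nn

class NestNet(nn.Module):
    def __init__(self, height, in_dim=1, width=8, depth=2, out_dim=1):
        super().__init__()
        dims = [in_dim] + [width] * depth + [out_dim]
        self.linears = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1)
        )
        if height <= 1:
            # Base case: a standard two-dimensional ReLU network.
            self.activation = nn.ReLU()
        else:
            # Recursive case (an assumption of this sketch): one shared
            # scalar-to-scalar NestNet of smaller height serves as the
            # activation function, applied elementwise to hidden neurons.
            self.activation = NestNet(height - 1, in_dim=1,
                                      width=width, depth=depth, out_dim=1)

    def forward(self, x):
        for linear in self.linears[:-1]:
            x = linear(x)
            if isinstance(self.activation, nn.ReLU):
                x = self.activation(x)
            else:
                # Apply the activation subnetwork elementwise: flatten the
                # hidden neurons into a batch of scalars, run the sub-NestNet,
                # then restore the original shape.
                shape = x.shape
                x = self.activation(x.reshape(-1, 1)).reshape(shape)
        return self.linears[-1](x)

# Usage: a height-2 NestNet on inputs from [0,1]^d with d = 3.
net = NestNet(height=2, in_dim=3, width=16, depth=3)
y = net(torch.rand(32, 3))
print(y.shape)  # torch.Size([32, 1])
```

Note that `NestNet(height=1, ...)` is exactly a standard fully connected ReLU network, matching the degenerate case described in the abstract.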
