Paper Title

Eigenvalue Distribution of Large Random Matrices Arising in Deep Neural Networks: Orthogonal Case

Paper Authors

Pastur, Leonid

Paper Abstract

The paper deals with the distribution of singular values of the input-output Jacobian of deep untrained neural networks in the limit of their infinite width. The Jacobian is the product of random matrices in which independent rectangular weight matrices alternate with diagonal matrices whose entries depend on the corresponding column of the nearest-neighbor weight matrix. The problem was considered in \cite{Pe-Co:18} for Gaussian weights and biases, and also for weights that are Haar-distributed orthogonal matrices with Gaussian biases. Based on a free probability argument, it was claimed that in these cases the singular value distribution of the Jacobian in the infinite-width (matrix size) limit coincides with that of an analog of the Jacobian with special random but weight-independent diagonal matrices, a case well known in random matrix theory. The claim was rigorously proved in \cite{Pa-Sl:21} for a quite general class of weights and biases with i.i.d. (including Gaussian) entries by using a version of the techniques of random matrix theory. In this paper we use another version of these techniques to justify the claim for random Haar-distributed weight matrices and Gaussian biases.
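
For context, here is a minimal sketch of the setup; the notation ($L$, $n_l$, $W^l$, $b^l$, $\varphi$, $D^l$) is assumed here rather than taken from the paper. For a depth-$L$ feedforward network with layer recursion $x^l = \varphi(W^l x^{l-1} + b^l)$, $l = 1, \dots, L$, the chain rule gives the input-output Jacobian as the alternating product

\[
J \;=\; \prod_{l=L}^{1} D^l W^l,
\qquad
D^l \;=\; \mathrm{diag}\big(\varphi'\big((W^l x^{l-1} + b^l)_i\big)\big)_{i=1}^{n_l},
\]

so each diagonal factor $D^l$ depends on its neighboring weight matrix $W^l$ (and bias $b^l$), which is exactly the dependence structure described above. The claim at issue is that in the infinite-width limit the singular value distribution of $J$ coincides with that of the analog

\[
\tilde J \;=\; \prod_{l=L}^{1} \tilde D^l W^l,
\]

in which the random diagonal matrices $\tilde D^l$ are independent of the weights, a product whose limiting singular value distribution is well known in random matrix theory.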
