On Random Matrices Arising in Deep Neural Networks: General I.I.D. Case

Paper title

On Random Matrices Arising in Deep Neural Networks: General I.I.D. Case

Authors

Pastur, L., Slavin, V.

Abstract

We study the distribution of singular values of products of random matrices pertinent to the analysis of deep neural networks. The matrices resemble products of sample covariance matrices; an important difference, however, is that the population covariance matrices, assumed in statistics and random matrix theory to be non-random or random but independent of the random data matrix, are now certain functions of the random data matrices (synaptic weight matrices in deep neural network terminology). The problem has been treated in recent work [25, 13] using the techniques of free probability theory. Since, however, free probability theory deals with population covariance matrices that are independent of the data matrices, its applicability has to be justified. The justification was given in [22] for Gaussian data matrices with independent entries, a standard analytical model of free probability, by using a version of the techniques of random matrix theory. In this paper we use another, more streamlined version of the techniques of random matrix theory to generalize the results of [22] to the case where the entries of the synaptic weight matrices are just independent identically distributed random variables with zero mean and finite fourth moment. This, in particular, extends the property of so-called macroscopic universality to the considered random matrices.
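
As a rough numerical illustration of what macroscopic universality means here, the sketch below compares the first two moments of the squared singular values of a product of square random matrices for two entry distributions with zero mean, unit variance and finite fourth moment (Gaussian and Rademacher). It is a deliberately simplified setting and not the paper's model: the factors are plain i.i.d. matrices, so the data-dependent population covariance factors described in the abstract are omitted. The function names, the 1/sqrt(n) scaling, and the choice of depth and size are all assumptions made for illustration.

```python
import random


def iid_matrix(n, sampler, rng):
    """n x n matrix of i.i.d. entries drawn by sampler(rng), scaled by 1/sqrt(n)."""
    scale = n ** -0.5
    return [[scale * sampler(rng) for _ in range(n)] for _ in range(n)]


def matmul(a, b):
    """Plain-Python matrix product a @ b for lists of rows."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def squared_singular_value_moments(n, depth, sampler, seed):
    """Return ((1/n) Tr M, (1/n) Tr M^2) for M = J^T J, J = X_depth ... X_1.

    The eigenvalues of M are the squared singular values of J, so these are
    the first two moments of their empirical distribution.
    """
    rng = random.Random(seed)
    j = iid_matrix(n, sampler, rng)
    for _ in range(depth - 1):
        j = matmul(iid_matrix(n, sampler, rng), j)
    j_t = [list(col) for col in zip(*j)]
    m = matmul(j_t, j)
    m1 = sum(m[i][i] for i in range(n)) / n
    # M is symmetric, so Tr M^2 equals the sum of its squared entries.
    m2 = sum(v * v for row in m for v in row) / n
    return m1, m2


def gaussian(rng):
    """Standard normal entry: zero mean, unit variance."""
    return rng.gauss(0.0, 1.0)


def rademacher(rng):
    """Entry equal to +1 or -1 with equal probability: zero mean, unit variance."""
    return rng.choice((-1.0, 1.0))


if __name__ == "__main__":
    for name, sampler in (("gaussian", gaussian), ("rademacher", rademacher)):
        m1, m2 = squared_singular_value_moments(60, 2, sampler, seed=0)
        print(f"{name:>10}: m1 = {m1:.3f}, m2 = {m2:.3f}")
```

For moderately large n the two rows should be close to each other, up to finite-size fluctuations. That is the behaviour universality predicts in this simplified case, but a simulation of this kind is only a sanity check and says nothing about the dependent setting the paper actually treats.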
