Paper Title

Theoretical Exploration of Solutions of Feedforward ReLU Networks

Authors

Huang, Changcun

Abstract

This paper aims to interpret the mechanism of feedforward ReLU networks by exploring their solutions for piecewise linear functions, deduced from basic rules. The constructed solutions should be universal enough to explain some network architectures used in engineering; to that end, several ways of enhancing the universality of the solutions are provided. Some consequences of our theory include: under an affine-geometry background, the solutions of both three-layer networks and deep-layer networks are given, particularly for those architectures applied in practice, such as multilayer feedforward neural networks and decoders; we give clear and intuitive interpretations of each component of the network architectures; the parameter-sharing mechanism for multiple outputs is investigated; we provide an explanation of overparameterized solutions in terms of affine transforms; and under our framework, an advantage of deep layers compared to shallower ones follows naturally. Some intermediate results constitute basic knowledge for the modeling or understanding of neural networks, such as the classification of data embedded in a higher-dimensional space, the generalization of affine transforms, the probabilistic model of matrix ranks, and the concept of distinguishable data sets.
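As background for the abstract's central object, the sketch below is a minimal illustration (not the paper's construction) of the fact that a one-hidden-layer feedforward ReLU network can exactly realize a continuous piecewise linear function on the real line. The function name pwl_to_relu_net and its encoding of breakpoints and slopes are hypothetical, introduced here only for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def pwl_to_relu_net(breakpoints, slopes, f0):
    """Encode a continuous piecewise linear f: R -> R as a one-hidden-layer
    ReLU network  f(x) = w2 @ relu(w1 * x + b1) + b2.  (Illustrative sketch.)

    breakpoints: sorted knots t_1 < ... < t_k
    slopes:      k+1 slopes; slopes[0] left of t_1, slopes[i] on (t_i, t_{i+1})
    f0:          value f(t_1) at the first knot
    """
    t = np.asarray(breakpoints, dtype=float)
    s = np.asarray(slopes, dtype=float)
    # Two units reproduce the leading affine piece s[0]*(x - t[0]) + f0 on all of R;
    # one extra unit per knot adds the slope increment s[i] - s[i-1] at t_i.
    w1 = np.concatenate(([1.0, -1.0], np.ones(len(t))))
    b1 = np.concatenate(([-t[0], t[0]], -t))
    w2 = np.concatenate(([s[0], -s[0]], np.diff(s)))
    b2 = f0
    return w1, b1, w2, b2

# Usage: |x| - 1 has one knot at 0, slopes -1 then +1, and value -1 at the knot.
w1, b1, w2, b2 = pwl_to_relu_net([0.0], [-1.0, 1.0], -1.0)
xs = np.linspace(-3, 3, 7)
net = w2 @ relu(np.outer(w1, xs) + b1[:, None]) + b2
print(np.allclose(net, np.abs(xs) - 1.0))  # True
```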
