Paper Title

Weight and Gradient Centralization in Deep Neural Networks

Authors

Wolfgang Fuhl, Enkelejda Kasneci

Abstract

Batch normalization is currently the most widely used variant of internal normalization for deep neural networks. Further work has shown that normalizing the weights, additional conditioning, and normalizing the gradients improve generalization even more. In this work, we combine several of these methods and thereby increase the generalization of the networks. Compared to batch normalization, the advantage of these newer methods is not only better generalization, but also that they only have to be applied during training and therefore do not influence the running time at inference. Link to the CUDA code: https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/
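The abstract does not spell out the centralization operations themselves. A common formulation of weight and gradient centralization subtracts the per-filter mean, so that each filter's weights, or the gradient of those weights, has zero mean. The Python/NumPy sketch below illustrates only that common formulation; the centralize helper, the dummy tensors, and the plain SGD update are illustrative assumptions, not the authors' released CUDA implementation.

import numpy as np

def centralize(t):
    # Subtract the mean over all axes except the output-filter axis (axis 0),
    # so that every filter (or its gradient) has zero mean.
    axes = tuple(range(1, t.ndim))
    return t - t.mean(axis=axes, keepdims=True)

# Hypothetical training step combining both techniques:
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))   # conv weights: (out, in, kH, kW)
grad = rng.standard_normal(w.shape)     # gradient from backprop (dummy values)

grad = centralize(grad)                 # gradient centralization
w -= 0.01 * grad                        # plain SGD update (illustrative)
w = centralize(w)                       # weight centralization after the update

# Both operations touch only the training loop, which is why they
# do not influence the running time at inference.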
