Paper Title

Characteristics of Monte Carlo Dropout in Wide Neural Networks

Authors

Joachim Sicking, Maram Akila, Tim Wirtz, Sebastian Houben, Asja Fischer

Abstract

Monte Carlo (MC) dropout is one of the state-of-the-art approaches for uncertainty estimation in neural networks (NNs). It has been interpreted as approximately performing Bayesian inference. Based on previous work on the approximation of Gaussian processes by wide and deep neural networks with random weights, we study the limiting distribution of wide untrained NNs under dropout more rigorously and prove that they as well converge to Gaussian processes for fixed sets of weights and biases. We sketch an argument that this property might also hold for infinitely wide feed-forward networks that are trained with (full-batch) gradient descent. The theory is contrasted by an empirical analysis in which we find correlations and non-Gaussian behaviour for the pre-activations of finite width NNs. We therefore investigate how (strongly) correlated pre-activations can induce non-Gaussian behavior in NNs with strongly correlated weights.
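
For readers unfamiliar with the technique, the sketch below illustrates the MC-dropout procedure the abstract refers to: dropout is kept active at prediction time, and the mean and standard deviation over repeated stochastic forward passes serve as the uncertainty estimate. It also runs a simple normality test on the output pre-activation across dropout masks, in the spirit of the paper's empirical analysis. The architecture, width, dropout rate, and choice of test statistic are illustrative assumptions, not the authors' experimental setup.

```python
# Minimal MC-dropout sketch in PyTorch (illustrative assumptions: the
# one-hidden-layer MLP, width=1024, dropout rate p=0.1, and the toy input
# are placeholders, not the paper's setup).
import torch
import torch.nn as nn
from scipy import stats

class MLP(nn.Module):
    def __init__(self, width=1024, p=0.1):
        super().__init__()
        self.fc1 = nn.Linear(1, width)
        self.drop = nn.Dropout(p)
        self.fc2 = nn.Linear(width, 1)

    def forward(self, x):
        # fc2 receives the dropped-out hidden layer; its output is the
        # final pre-activation whose limiting distribution the paper studies.
        return self.fc2(self.drop(torch.relu(self.fc1(x))))

@torch.no_grad()
def mc_dropout_samples(model, x, n_samples=2000):
    """Draw n_samples stochastic forward passes with dropout kept active."""
    model.train()  # train mode keeps nn.Dropout stochastic at test time
    return torch.stack([model(x) for _ in range(n_samples)])

model = MLP()
x = torch.tensor([[0.5]])
samples = mc_dropout_samples(model, x).squeeze()

# Predictive mean/std over dropout masks: the MC-dropout uncertainty estimate.
print("mean:", samples.mean().item(), "std:", samples.std().item())

# Normality check of the output pre-activation across dropout masks; the
# paper reports non-Gaussian behaviour at finite width, with the Gaussian
# process limit reached only as the width grows.
stat, pval = stats.normaltest(samples.numpy())
print("D'Agostino-Pearson p-value:", pval)
```

As a rough experiment, increasing `width` in this sketch should make the sampled output distribution look more Gaussian, mirroring the wide-network limit the paper proves for untrained networks with fixed weights and biases.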
