Paper Title
Visualizing Information Bottleneck through Variational Inference
Paper Authors
Paper Abstract
The Information Bottleneck theory provides a theoretical and computational framework for finding approximate minimum sufficient statistics. Analysis of the Stochastic Gradient Descent (SGD) training of a neural network on a toy problem has shown the existence of two phases, fitting and compression. In this work, we analyze the SGD training process of a Deep Neural Network on MNIST classification and confirm the existence of two phases of SGD training. We also propose a setup for estimating the mutual information for a Deep Neural Network through Variational Inference.
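The abstract mentions estimating mutual information through variational inference. As background, one common way to make the Information Bottleneck objective tractable (the approach popularized by Deep Variational Information Bottleneck; the paper's exact construction may differ) is to bound the two mutual-information terms with variational distributions $q(y\mid z)$ and $r(z)$:

```latex
% Information Bottleneck objective: maximize I(Z;Y) - \beta I(X;Z).
% Variational lower bound on the prediction term (q(y|z) approximates p(y|z)):
I(Z;Y) \;\geq\; \mathbb{E}_{p(x,y)\,p(z\mid x)}\!\left[\log q(y\mid z)\right] + H(Y)

% Variational upper bound on the compression term (r(z) approximates the marginal p(z)):
I(X;Z) \;\leq\; \mathbb{E}_{p(x)}\!\left[\mathrm{KL}\!\left(p(z\mid x)\,\|\,r(z)\right)\right]
```

Both bounds follow from the non-negativity of KL divergence, and together they yield a tractable surrogate objective that can be optimized with SGD, which is what makes mutual-information estimates available during the fitting and compression phases described above.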