Paper Title
Bayesian Deep Learning with Multilevel Trace-class Neural Networks
Paper Authors
Abstract
In this article, we consider Bayesian inference associated with deep neural networks (DNNs) and, in particular, trace-class neural network (TNN) priors, which can be preferable to traditional DNN priors because (a) they are identifiable and (b) they possess desirable convergence properties. TNN priors are defined on functions with infinitely many hidden units and admit strongly convergent Karhunen-Loève-type approximations with finitely many hidden units. A practical hurdle is that the Bayesian solution is computationally demanding, requiring simulation methods, so approaches that drive down the complexity are needed. In this paper, we leverage the strong convergence of TNNs to apply Multilevel Monte Carlo (MLMC) to these models. In particular, an MLMC method is introduced and used to approximate posterior expectations of Bayesian TNN models with optimal computational complexity, and this is proven mathematically. The results are verified with several numerical experiments on model problems arising in machine learning, including regression, classification, and reinforcement learning.
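The core idea behind MLMC, as invoked in the abstract, is the telescoping identity E[P_L] = E[P_0] + Σ_{l=1}^{L} E[P_l − P_{l−1}]: cheap, coarse approximations are sampled heavily while expensive level corrections need few samples because the strong convergence makes them small. The sketch below is a hedged, generic illustration (not the paper's method): `approx` is a toy truncated-series stand-in for a finite-width TNN truncation at level `l`, and `mlmc_estimate` is a hypothetical helper implementing the telescoping estimator with coupled samples per level.

```python
import random
import math

def approx(x, level):
    # Toy level-l approximation of exp(x): a truncated Taylor series with
    # 2**level + 2 terms. This stands in for the Karhunen-Loeve-type
    # truncation with finitely many hidden units; it is NOT the paper's model.
    return sum(x**k / math.factorial(k) for k in range(2**level + 2))

def mlmc_estimate(max_level, samples_per_level, rng):
    """Estimate E[approx(X, max_level)] for X ~ Uniform(0, 1) via the
    MLMC telescoping sum, coupling fine and coarse levels on each draw."""
    total = 0.0
    for l in range(max_level + 1):
        n = samples_per_level[l]
        acc = 0.0
        for _ in range(n):
            x = rng.random()  # the SAME sample drives both levels (coupling)
            fine = approx(x, l)
            coarse = approx(x, l - 1) if l > 0 else 0.0
            acc += fine - coarse
        total += acc / n  # Monte Carlo average of the level-l correction
    return total

rng = random.Random(0)
# Many samples on cheap coarse levels, few on expensive fine levels --
# this allocation is what yields the improved complexity.
est = mlmc_estimate(3, [4000, 2000, 1000, 500], rng)
print(est)
```

Because the level corrections `P_l − P_{l−1}` shrink rapidly under strong convergence, the estimate lands close to E[exp(X)] = e − 1 ≈ 1.718 at a fraction of the cost of sampling the finest level alone.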