Structure Probing Neural Network Deflation

Paper Title

Structure Probing Neural Network Deflation

Authors

Yiqi Gu, Chunmei Wang, Haizhao Yang

Abstract

Deep learning is a powerful tool for solving nonlinear differential equations, but usually only the solution corresponding to the flattest local minimizer can be found, owing to the implicit regularization of stochastic gradient descent. This paper proposes a network-based structure probing deflation method that enables deep learning to identify multiple solutions, which are ubiquitous and important in nonlinear physical models. First, we introduce deflation operators built from known solutions so that those solutions are no longer local minimizers of the optimization energy landscape. Second, to facilitate convergence to the desired local minimizer, a structure probing technique is proposed to obtain an initial guess close to that minimizer. Together with the neural network structures carefully designed in this paper, the new regularized optimization can converge to new solutions efficiently. Owing to the mesh-free nature of deep learning, the proposed method can solve high-dimensional problems with multiple solutions on complicated domains, whereas existing methods are limited to regular domains in one or two dimensions and are more expensive in operation counts. Numerical experiments also demonstrate that the proposed method can find more solutions than existing methods.
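The deflation idea in the abstract can be illustrated on a scalar toy problem. The sketch below is not the paper's neural-network method: it assumes a residual F(x) = x^2 - 1 with roots at +1 and -1, a deflation factor of the form 1/|x - r|^p + alpha (a form common in the deflation literature; the abstract does not state which operator the authors use), and plain gradient descent with a finite-difference gradient. The names `residual`, `deflated_residual`, and `minimize`, and the values of `p`, `alpha`, `lr`, and `x0`, are all hypothetical choices for illustration.

```python
def residual(x):
    # Toy nonlinear equation F(x) = 0 with two solutions, x = 1 and x = -1.
    return x * x - 1.0


def deflated_residual(x, known, p=2, alpha=1.0):
    # Multiply the residual by one deflation factor per known solution.
    # Assumed form: 1 / |x - r|^p + alpha. With p = 2 the product blows up
    # as x approaches a known simple root r, so r stops being a minimizer.
    factor = 1.0
    for r in known:
        factor *= 1.0 / abs(x - r) ** p + alpha
    return factor * residual(x)


def minimize(loss, x0, lr=0.01, steps=5000, h=1e-6):
    # Plain gradient descent using a central finite-difference gradient.
    x = x0
    for _ in range(steps):
        grad = (loss(x + h) - loss(x - h)) / (2.0 * h)
        x -= lr * grad
    return x


# Without deflation, descent from x0 = 0.5 settles on one solution.
first = minimize(lambda x: residual(x) ** 2, x0=0.5)

# Deflating that solution and restarting from the same initial guess
# drives the iterate to the other solution instead.
second = minimize(lambda x: deflated_residual(x, [first]) ** 2, x0=0.5)

print(first, second)
```

With these settings the first run approaches x = 1 and the deflated run, started from the same initial guess, approaches x = -1. In the paper's setting the scalar x would be replaced by a neural network and the squared residual by a loss over the differential equation; the structure probing step for choosing initial guesses is not modeled here.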
