Paper Title

A Novel Neural Network Training Framework with Data Assimilation

Paper Authors

Chong Chen, Qinghui Xing, Xin Ding, Yaru Xue, Tianfu Zhong

Paper Abstract

In recent years, the prosperity of deep learning has revolutionized Artificial Neural Networks (ANNs). However, the dependence on gradients and the offline training mechanism of the learning algorithms hinder further improvement of ANNs. In this study, a gradient-free training framework based on data assimilation is proposed to avoid the calculation of gradients. In data assimilation algorithms, the error covariance between the forecasts and observations is used to optimize the parameters. Feedforward Neural Networks (FNNs) are trained by gradient descent and by two data assimilation algorithms, the Ensemble Kalman Filter (EnKF) and the Ensemble Smoother with Multiple Data Assimilation (ESMDA). ESMDA trains the FNN for a pre-defined number of iterations by updating the parameters with all available observations, which can be regarded as offline learning. EnKF optimizes the FNN by updating the parameters whenever new observations become available, which can be regarded as online learning. Two synthetic cases, the regression of a sine function and of a Mexican hat function, are used to validate the effectiveness of the proposed framework. The Root Mean Square Error (RMSE) and coefficient of determination (R²) are used as criteria to assess the performance of the different methods. The results show that the proposed training framework performs better than the gradient descent method. The proposed framework provides an alternative for online/offline training of existing ANNs (e.g., Convolutional Neural Networks, Recurrent Neural Networks) without relying on gradients.
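To make the ensemble-based, gradient-free update concrete, below is a minimal NumPy sketch of an EnKF-style analysis step applied to flattened network weights, in the spirit of the abstract: the parameter ensemble plays the role of the state, the network's forward pass acts as the observation operator, and the cross-covariance between parameters and predictions replaces the gradient. This is an illustrative assumption of how such an update can look, not the authors' implementation; the function name enkf_update, the ensemble layout, and the observation-noise level obs_err_std are hypothetical choices.

```python
import numpy as np

def enkf_update(W, D_pred, d_obs, obs_err_std=0.05):
    """One EnKF analysis step for an ensemble of flattened network weights.

    W      : (n_ens, n_params) ensemble of parameter vectors (the "state")
    D_pred : (n_ens, n_obs)    network predictions for each ensemble member
    d_obs  : (n_obs,)          observed targets for the current batch
    """
    n_ens, n_obs = W.shape[0], len(d_obs)

    # Anomalies (deviations from the ensemble mean)
    A_w = W - W.mean(axis=0, keepdims=True)        # (n_ens, n_params)
    A_d = D_pred - D_pred.mean(axis=0, keepdims=True)  # (n_ens, n_obs)

    # Cross-covariance parameters/predictions and prediction covariance
    C_wd = A_w.T @ A_d / (n_ens - 1)               # (n_params, n_obs)
    C_dd = A_d.T @ A_d / (n_ens - 1)               # (n_obs, n_obs)
    R = (obs_err_std ** 2) * np.eye(n_obs)         # observation error covariance

    # Kalman gain built from ensemble statistics (no gradients involved)
    K = C_wd @ np.linalg.inv(C_dd + R)             # (n_params, n_obs)

    # Perturbed observations (stochastic EnKF) and analysis update
    d_pert = d_obs + obs_err_std * np.random.randn(n_ens, n_obs)
    return W + (d_pert - D_pred) @ K.T             # updated ensemble (n_ens, n_params)
```

Used online, this step would be applied each time a new batch of observations arrives; an ESMDA-style offline variant would instead repeat a similar update for a fixed number of iterations over all observations, with the observation error covariance inflated at each iteration.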
