Paper Title

Restrained Generative Adversarial Network against Overfitting in Numeric Data Augmentation

Authors

Wei Wang, Yimeng Chai, Tao Cui, Chuang Wang, Baohua Zhang, Yue Li, Yi An

Abstract

In recent studies, the Generative Adversarial Network (GAN) has been one of the popular schemes for augmenting image datasets. However, in our study we find that the generator G in a GAN fails to generate numerical data in lower-dimensional spaces, and we address the overfitting that arises in this generation. By analyzing the Directed Graphical Model (DGM), we propose a theoretical restraint, independence on the loss function, to suppress the overfitting. Practically, two frameworks, the Statically Restrained GAN (SRGAN) and the Dynamically Restrained GAN (DRGAN), are proposed to apply the theoretical restraint to the network structure. In the static structure, we predefine a pair of particular network topologies of G and D as the restraint, and quantify this restraint with an interpretable metric, the Similarity of the Restraint (SR). For DRGAN, we design an adjustable dropout module that serves as the restraint function. In 20 groups of extensive experiments on four public numerical class-imbalance datasets and five classifiers, the static and dynamic methods together produce the best augmentation results in 19 of the 20 groups, and both methods simultaneously rank among the top-2 in 14 of the 20 groups, proving the effectiveness and feasibility of the theoretical restraints.
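
The abstract describes the dynamic restraint only at a high level, so the following is a minimal sketch of how an adjustable dropout module might be wired into a small generator for low-dimensional numeric data. It assumes a PyTorch implementation; the class names (AdjustableDropout, Generator), layer sizes, and the set_rate update call are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch of a DRGAN-style adjustable dropout restraint (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdjustableDropout(nn.Module):
    """Dropout whose rate can be changed during training to restrain capacity."""

    def __init__(self, p: float = 0.2, p_min: float = 0.0, p_max: float = 0.8):
        super().__init__()
        self.p, self.p_min, self.p_max = p, p_min, p_max

    def set_rate(self, p: float) -> None:
        # Clamp the requested rate so it stays within a valid range.
        self.p = float(min(max(p, self.p_min), self.p_max))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.dropout(x, p=self.p, training=self.training)


class Generator(nn.Module):
    """A small MLP generator for low-dimensional numeric data."""

    def __init__(self, noise_dim: int = 16, data_dim: int = 8, hidden: int = 64):
        super().__init__()
        self.drop = AdjustableDropout(p=0.2)
        self.net = nn.Sequential(
            nn.Linear(noise_dim, hidden), nn.ReLU(),
            self.drop,                      # restraint applied on the hidden layer
            nn.Linear(hidden, data_dim),
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)


# Usage: tighten the restraint dynamically, e.g. when signs of overfitting appear.
g = Generator()
fake = g(torch.randn(32, 16))   # forward pass with the current dropout rate
g.drop.set_rate(0.5)            # raise the rate to impose a stronger restraint
```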
