Paper Title
TUTOR: Training Neural Networks Using Decision Rules as Model Priors
Paper Authors
Paper Abstract
The human brain has the ability to carry out new tasks with limited experience. It utilizes prior learning experiences to adapt the solution strategy to new domains. On the other hand, deep neural networks (DNNs) generally need large amounts of data and computational resources for training. However, this requirement is not met in many settings. To address these challenges, we propose the TUTOR DNN synthesis framework. TUTOR targets tabular datasets. It synthesizes accurate DNN models with limited available data and reduced memory/computational requirements. It consists of three sequential steps. The first step involves generation, verification, and labeling of synthetic data. The synthetic data generation module targets both categorical and continuous features. TUTOR generates the synthetic data from the same probability distribution as the real data. It then verifies the integrity of the generated synthetic data using a semantic integrity classifier module. It labels the synthetic data based on a set of rules extracted from the real dataset. Next, TUTOR uses two training schemes that combine the synthetic data with the real training data to learn the parameters of the DNN model. These two schemes focus on two different ways in which synthetic data can be used to derive a prior on the model parameters and, hence, provide a better DNN initialization for training with real data. In the third step, TUTOR employs a grow-and-prune synthesis paradigm to learn both the weights and the architecture of the DNN, reducing model size while ensuring accuracy. We evaluate the performance of TUTOR on nine datasets of various sizes. We show that, compared to fully connected DNNs, TUTOR on average reduces the need for data by 5.9x, improves accuracy by 3.4%, and reduces the number of parameters (FLOPs) by 4.7x (4.3x). Thus, TUTOR enables less data-hungry, more accurate, and more compact DNN synthesis.
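To make the pipeline described above concrete, here is a minimal sketch, not the paper's implementation: it assumes per-feature independent sampling for the synthetic-data generator (the paper models the real data's distribution more carefully), uses a shallow decision tree as a stand-in for the rule-extraction module, and omits the semantic integrity classifier and the grow-and-prune step. The helper `sample_synthetic` and the toy dataset are hypothetical.

```python
# Sketch of a TUTOR-style flow: generate synthetic tabular data, label it
# with rules extracted from real data, and use it as a prior (pre-training)
# before fine-tuning on the scarce real data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

def sample_synthetic(X_real, categorical_cols, n_samples, rng):
    """Draw synthetic rows feature-by-feature from the empirical
    distribution of the real data (independence assumption)."""
    n_features = X_real.shape[1]
    X_syn = np.empty((n_samples, n_features))
    for j in range(n_features):
        col = X_real[:, j]
        if j in categorical_cols:
            values, counts = np.unique(col, return_counts=True)
            X_syn[:, j] = rng.choice(values, size=n_samples,
                                     p=counts / counts.sum())
        else:
            # Gaussian fit as a stand-in for the continuous-feature model.
            X_syn[:, j] = rng.normal(col.mean(), col.std() + 1e-8, n_samples)
    return X_syn

rng = np.random.default_rng(0)

# Toy "real" dataset: one categorical and one continuous feature.
X_real = np.column_stack([rng.integers(0, 3, 200), rng.normal(0, 1, 200)])
y_real = (X_real[:, 0] + X_real[:, 1] > 1).astype(int)

# Step 1: generate synthetic rows and label them with rules extracted from
# the real data (a shallow decision tree plays the rule extractor here).
X_syn = sample_synthetic(X_real, categorical_cols={0}, n_samples=2000, rng=rng)
rules = DecisionTreeClassifier(max_depth=3).fit(X_real, y_real)
y_syn = rules.predict(X_syn)

# Step 2: use the labeled synthetic data to obtain a prior on the model
# parameters, then fine-tune on the limited real data.
dnn = MLPClassifier(hidden_layer_sizes=(32, 32), random_state=0)
dnn.partial_fit(X_syn, y_syn, classes=np.array([0, 1]))  # prior from synthetic data
for _ in range(50):                                       # fine-tune on real data
    dnn.partial_fit(X_real, y_real)

print("train accuracy on real data:", dnn.score(X_real, y_real))
```

The key design choice mirrored here is that the synthetic data never replaces the real data; it only supplies an initialization (a prior), after which the real data drives the final fit.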