Paper Title

Dual Convexified Convolutional Neural Networks

Paper Authors

Site Bai, Chuyang Ke, Jean Honorio

Paper Abstract

We propose the framework of dual convexified convolutional neural networks (DCCNNs). In this framework, we first introduce a primal learning problem motivated by convexified convolutional neural networks (CCNNs), and then construct the dual convex training program through careful analysis of the Karush-Kuhn-Tucker (KKT) conditions and Fenchel conjugates. Our approach reduces the computational overhead of constructing a large kernel matrix and, more importantly, eliminates the ambiguity of factorizing the matrix. Due to the low-rank structure in CCNNs and the related subdifferential of nuclear norms, there is no closed-form expression for recovering the primal solution from the dual solution. To overcome this, we propose a highly novel weight recovery algorithm, which takes the dual solution and the kernel information as input, and recovers the linear weights and the outputs of the convolutional layer, instead of the weight parameters. Furthermore, our recovery algorithm exploits the low-rank structure and indirectly imposes a small number of filters, which reduces the parameter size. As a result, DCCNNs inherit all the statistical benefits of CCNNs, while enjoying a more formal and efficient workflow.
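For intuition about where a dual convex program of this kind comes from, here is a minimal sketch in our own notation (an assumption, not the paper's exact formulation): CCNN-style training can be written as nuclear-norm-regularized linear learning over a kernelized feature map, and because the Fenchel conjugate of the nuclear norm is the indicator of the spectral-norm ball, Fenchel-Rockafellar duality yields a convex dual with a spectral-norm constraint.

```latex
% Illustrative primal/dual pair in our own notation (a sketch, not the
% paper's exact program). Primal: nuclear-norm-regularized learning over
% kernelized features \Phi(x_i):
\min_{A}\; \sum_{i=1}^{n} \ell\big(\langle A,\Phi(x_i)\rangle,\, y_i\big)
  \;+\; \lambda\,\|A\|_{*}
% The conjugate of \lambda\|\cdot\|_{*} is the indicator of the spectral-norm
% ball \{Z : \|Z\|_{2} \le \lambda\}, so the dual becomes:
\max_{\alpha}\; -\sum_{i=1}^{n} \ell^{*}\!\big(-\alpha_i,\, y_i\big)
\quad \text{s.t.} \quad
\Big\|\sum_{i=1}^{n} \alpha_i\,\Phi(x_i)\Big\|_{2} \;\le\; \lambda
```

Here \|\cdot\|_{2} is the spectral norm and \ell^{*} is the conjugate of the loss in its first argument; the KKT conditions tie the primal solution to the top singular subspace of the dual certificate \sum_i \alpha_i \Phi(x_i), which motivates the recovery sketch below.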

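Since the abstract states that no closed-form primal recovery exists, the NumPy sketch below only illustrates the general shape of such a dual-to-primal recovery step under our own assumptions: the names recover_outputs, Phi, alpha, lam, and r are all hypothetical, and the rescaling is a placeholder rather than the paper's actual algorithm.

```python
import numpy as np

def recover_outputs(Phi, alpha, lam, r):
    """Illustrative dual-to-primal recovery sketch (not the paper's algorithm).

    Phi   : (n, d1, d2) array, kernelized feature matrices Phi(x_i)
    alpha : (n,) dual solution
    lam   : nuclear-norm regularization strength
    r     : target rank, i.e. the (small) number of filters to keep
    """
    # Dual certificate M = sum_i alpha_i * Phi(x_i); the KKT conditions tie
    # the primal solution to the top of M's spectrum when the constraint
    # ||M||_2 <= lam is active.
    M = np.tensordot(alpha, Phi, axes=(0, 0))          # shape (d1, d2)

    # Low-rank structure: keep only the top-r singular directions, which
    # indirectly imposes a small number of filters.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    U_r, Vt_r = U[:, :r], Vt[:r, :]

    # Rank-r weight surrogate built from M's top singular directions (the
    # true scaling would be fixed by the KKT conditions; dividing by lam
    # here is only a placeholder).
    A_hat = U_r @ np.diag(s[:r] / lam) @ Vt_r

    # Outputs of the convolutional layer on the training features, recovered
    # directly instead of explicit weight parameters.
    outputs = np.einsum('nij,ij->n', Phi, A_hat)
    return A_hat, outputs
```

A call like recover_outputs(Phi, alpha, lam=1.0, r=8) would return a rank-8 weight surrogate together with the layer outputs on the training set; keeping r small is what indirectly enforces a small number of filters and reduces the parameter size.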