Paper Title

Compression strategies and space-conscious representations for deep neural networks

Authors

Giosuè Cataldo Marinò, Gregorio Ghidoli, Marco Frasca, Dario Malchiodi

Abstract

Recent advances in deep learning have made available large, powerful convolutional neural networks (CNNs) with state-of-the-art performance in several real-world applications. Unfortunately, these large-sized models have millions of parameters, so they cannot be deployed on resource-limited platforms (e.g., where RAM is constrained). Compression of CNNs therefore becomes a critical problem for achieving memory-efficient, and possibly computationally faster, model representations. In this paper, we investigate the impact of lossy compression of CNNs via weight pruning and quantization, as well as lossless weight-matrix representations based on source coding. We tested several combinations of these techniques on four benchmark datasets for classification and regression problems, achieving compression rates of up to $165\times$ while preserving or improving model performance.
