Paper Title

Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks

Paper Authors

Julieta Martinez, Jashan Shewakramani, Ting Wei Liu, Ioan Andrei Bârsan, Wenyuan Zeng, Raquel Urtasun

Paper Abstract

Compressing large neural networks is an important step for their deployment in resource-constrained computational platforms. In this context, vector quantization is an appealing framework that expresses multiple parameters using a single code, and has recently achieved state-of-the-art network compression on a range of core vision and natural language processing tasks. Key to the success of vector quantization is deciding which parameter groups should be compressed together. Previous work has relied on heuristics that group the spatial dimension of individual convolutional filters, but a general solution remains unaddressed. This is desirable for pointwise convolutions (which dominate modern architectures), linear layers (which have no notion of spatial dimension), and convolutions (when more than one filter is compressed to the same codeword). In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function. We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress. Finally, we rely on an annealed quantization algorithm to better compress the network and achieve higher final accuracy. We show results on image classification, object detection, and segmentation, reducing the gap with the uncompressed model by 40 to 70% with respect to the current state of the art.
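The core observation in the abstract, that the weights of two adjacent layers can be permuted while expressing the same function, can be illustrated with a small sketch. The NumPy snippet below is a minimal illustration under assumed shapes and names (W1, W2, forward, and the use of two fully connected layers with a ReLU in between are illustrative assumptions, not the paper's code); the actual method additionally searches for permutations that make the weights easier to vector-quantize, guided by a rate-distortion objective.

```python
import numpy as np

# Minimal sketch: permuting the output channels of one layer and the input
# channels of the next layer by the same permutation leaves the network's
# function unchanged. Shapes and names are illustrative assumptions.

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 8, 16, 4

W1 = rng.standard_normal((d_hidden, d_in))   # first layer:  h = relu(W1 @ x)
W2 = rng.standard_normal((d_out, d_hidden))  # second layer: y = W2 @ h
x = rng.standard_normal(d_in)

def forward(W1, W2, x):
    # ReLU is elementwise, so it commutes with any permutation of the hidden units.
    return W2 @ np.maximum(W1 @ x, 0.0)

# Apply a random permutation to the shared (hidden) dimension.
perm = rng.permutation(d_hidden)
W1_perm = W1[perm, :]   # permute rows (output channels) of the first layer
W2_perm = W2[:, perm]   # permute columns (input channels) of the second layer

# The permuted network computes exactly the same function ...
assert np.allclose(forward(W1, W2, x), forward(W1_perm, W2_perm, x))
# ... but its weight rows can now be regrouped so that vector quantization
# maps similar sub-vectors to the same codeword, i.e. the network is easier to compress.
```

Because an elementwise nonlinearity commutes with any permutation of its inputs, the same argument extends to convolutional layers by permuting output channels of one layer and the corresponding input channels of the next.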
