Paper Title
Generalized Octave Convolutions for Learned Multi-Frequency Image Compression
Authors
Abstract
Learned image compression has recently shown the potential to outperform standard codecs. State-of-the-art rate-distortion (R-D) performance has been achieved by context-adaptive entropy coding approaches in which hyperprior and autoregressive models are jointly utilized to effectively capture the spatial dependencies in the latent representations. However, in previous work the latents are feature maps that all share the same spatial resolution, and they therefore contain redundancies that affect the R-D performance. In this paper, we propose the first learned multi-frequency image compression and entropy coding approach, which is based on the recently developed octave convolutions and factorizes the latents into high- and low-frequency (resolution) components, where the low-frequency component is represented at a lower resolution. Its spatial redundancy is thereby reduced, which improves the R-D performance. Novel generalized octave convolution and octave transposed-convolution architectures with internal activation layers are also proposed to preserve more of the spatial structure of the information. Experimental results show that the proposed scheme outperforms all existing learned methods as well as standard codecs, such as the next-generation video coding standard VVC (4:2:0), on the Kodak dataset in both PSNR and MS-SSIM. We also show that the proposed generalized octave convolution can improve the performance of other auto-encoder-based computer vision tasks such as semantic segmentation and image denoising.
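As a rough illustration of why storing the low-frequency component at a lower resolution reduces spatial redundancy, the sketch below counts latent elements before and after such a split. It assumes that the low-frequency maps are kept at half the height and width (one octave lower, as in the original octave convolution) and that a fraction `alpha` of the channels is assigned to the low-frequency branch. The function name, the latent shape, and the value of `alpha` are hypothetical and are not taken from the paper.

```python
def multi_frequency_latent_size(channels, height, width, alpha):
    """Count latent elements when a fraction `alpha` of channels is low-frequency.

    Assumption: low-frequency maps are stored at half the height and width
    (one octave lower); high-frequency maps keep the full resolution.
    """
    low_channels = int(round(alpha * channels))
    high_channels = channels - low_channels
    high_elems = high_channels * height * width
    low_elems = low_channels * (height // 2) * (width // 2)
    return high_elems, low_elems


# Hypothetical latent shape and split ratio, chosen only for illustration.
C, H, W = 192, 16, 16
baseline = C * H * W  # single-resolution latent: every channel at H x W
high, low = multi_frequency_latent_size(C, H, W, alpha=0.5)
saving = 1 - (high + low) / baseline

print(f"single-resolution elements: {baseline}")
print(f"multi-frequency elements:   {high + low}")
print(f"fraction of elements saved: {saving}")
```

Under these assumptions the count shrinks by a factor of 1 - 3·alpha/4. A smaller element count does not by itself determine the bit rate, which also depends on the entropy model, so this is only a first-order intuition for the redundancy argument in the abstract.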