Analysis of Neural Image Compression Networks for Machine-to-Machine Communication

Paper title

Analysis of Neural Image Compression Networks for Machine-to-Machine Communication

Authors

Kristian Fischer, Christian Forsch, Christian Herglotz, André Kaup

Abstract

Video and image coding for machines (VCM) is an emerging field that aims to develop compression methods that produce optimal bitstreams when the decoded frames are analyzed by a neural network. Several approaches already exist that improve classic hybrid codecs for this task. However, neural compression networks (NCNs) have made enormous progress in image coding in recent years. It is therefore reasonable to consider such NCNs when the information sink at the decoder side is also a neural network. To this end, we build an evaluation framework that analyzes the performance of four state-of-the-art NCNs when a Mask R-CNN segments objects from the decoded image. The compression performance is measured by the weighted average precision on the Cityscapes dataset. Based on this analysis, we find that networks using leaky ReLU as the non-linearity and trained with SSIM as the distortion criterion yield the highest coding gains for the VCM task. Furthermore, we show that the GAN-based NCN architecture achieves the best coding performance and even outperforms the recently standardized Versatile Video Coding (VVC) for the given scenario.
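The abstract evaluates each codec by pairing a rate with a task accuracy (the weighted average precision of Mask R-CNN on the decoded images). The following is a minimal sketch of how such a rate/accuracy point could be computed. It is not the authors' implementation: the function names, the use of bits per pixel as the rate measure, the weighting of each class by its instance count, and all numbers in the usage lines are assumptions made for illustration.

```python
def bits_per_pixel(num_bytes, width, height):
    """Rate of one coded image in bits per pixel (assumed rate measure)."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    return 8 * num_bytes / (width * height)


def weighted_average_precision(ap_per_class, instances_per_class):
    """Average per-class AP values, weighting each class by its instance count.

    Assumption: the weights are the number of ground-truth instances per
    class; the abstract does not state how the weights are chosen.
    """
    total = sum(instances_per_class[c] for c in ap_per_class)
    if total == 0:
        raise ValueError("no ground-truth instances to weight by")
    weighted_sum = sum(ap_per_class[c] * instances_per_class[c] for c in ap_per_class)
    return weighted_sum / total


# Hypothetical values for a single rate point (not results from the paper).
rate = bits_per_pixel(num_bytes=262144, width=2048, height=1024)
wap = weighted_average_precision(
    ap_per_class={"car": 0.5, "person": 0.25},
    instances_per_class={"car": 3, "person": 1},
)
print(rate, wap)
```

Repeating this for several quality settings of each codec gives one rate/accuracy curve per codec, which is the kind of comparison the abstract describes between the NCNs and VVC.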
