Transferring and Regularizing Prediction for Semantic Segmentation

Paper Title

Transferring and Regularizing Prediction for Semantic Segmentation

Authors

Zhang, Yiheng; Qiu, Zhaofan; Yao, Ting; Ngo, Chong-Wah; Liu, Dong; Mei, Tao

Abstract

Semantic segmentation often requires a large set of images with pixel-level annotations. Given the extreme cost of expert labeling, recent research has shown that models trained on photo-realistic synthetic data (e.g., computer games) with computer-generated annotations can be adapted to real images. Despite this progress, without constraints on the predictions for real images, the models easily overfit to synthetic data due to severe domain mismatch. In this paper, we exploit the intrinsic properties of semantic segmentation to alleviate this problem in model transfer. Specifically, we present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion. These constraints include patch-level, cluster-level and context-level semantic prediction consistencies at different levels of image formation. As the transfer is label-free and data-driven, the robustness of prediction is addressed by selectively involving a subset of image regions for model regularization. Extensive experiments are conducted to verify RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to the Cityscapes dataset (urban street scenes). RPT shows consistent improvements when its constraints are injected into several neural networks for semantic segmentation. More remarkably, when RPT is integrated into an adversarial-based segmentation framework, we report the best results to date: mIoU of 53.2%/51.7% when transferring from GTA5/SYNTHIA to Cityscapes, respectively.
