Paper Title

Exploring High-quality Target Domain Information for Unsupervised Domain Adaptive Semantic Segmentation

Authors

Junjie Li, Zilei Wang, Yuan Gao, Xiaoming Hu

Abstract

In unsupervised domain adaptive (UDA) semantic segmentation, distillation-based methods are currently dominant in performance. However, the distillation technique requires a complicated multi-stage process and many training tricks. In this paper, we propose a simple yet effective method that can achieve performance competitive with the advanced distillation methods. Our core idea is to fully explore the target-domain information from the views of boundaries and features. First, we propose a novel mix-up strategy to generate high-quality target-domain boundaries with ground-truth labels. Different from the source-domain boundaries in previous works, we select the high-confidence target-domain areas and then paste them onto the source-domain images. Such a strategy can generate the object boundaries in the target domain (edges of target-domain object regions) with the correct labels. Consequently, the boundary information of the target domain can be effectively captured by learning on the mixed-up samples. Second, we design a multi-level contrastive loss to improve the representation of target-domain data, including pixel-level and prototype-level contrastive learning. By combining the two proposed methods, more discriminative features can be extracted and hard object boundaries can be better addressed for the target domain. The experimental results on two commonly adopted benchmarks (\textit{i.e.}, GTA5 $\rightarrow$ Cityscapes and SYNTHIA $\rightarrow$ Cityscapes) show that our method achieves competitive performance to complicated distillation methods. Notably, for the SYNTHIA $\rightarrow$ Cityscapes scenario, our method achieves state-of-the-art performance with $57.8\%$ mIoU and $64.6\%$ mIoU on the 16-class and 13-class settings, respectively. Code is available at https://github.com/ljjcoder/EHTDI.
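The confidence-based mix-up described in the abstract (selecting high-confidence target-domain regions and pasting them onto source-domain images) can be sketched as below. This is a minimal illustration only, not the authors' implementation; the function name `confidence_mixup`, the threshold `tau`, and the pseudo-label/confidence inputs are assumptions for the sketch:

```python
import numpy as np

def confidence_mixup(src_img, src_lbl, tgt_img, tgt_pseudo, tgt_conf, tau=0.9):
    """Paste high-confidence target-domain regions onto a source-domain image.

    src_img:    source image, shape (H, W, 3)
    src_lbl:    source ground-truth label map, shape (H, W)
    tgt_img:    target image, shape (H, W, 3)
    tgt_pseudo: target pseudo-label map, shape (H, W)
    tgt_conf:   per-pixel prediction confidence for the target image, shape (H, W)
    tau:        confidence threshold (hypothetical value)
    """
    # Pixels where the target pseudo-label is trusted.
    mask = tgt_conf > tau

    mixed_img = src_img.copy()
    mixed_lbl = src_lbl.copy()

    # Copy the trusted target regions (and their pseudo-labels) onto the
    # source image; the edges of the pasted regions form target-domain
    # object boundaries with correct labels.
    mixed_img[mask] = tgt_img[mask]
    mixed_lbl[mask] = tgt_pseudo[mask]
    return mixed_img, mixed_lbl
```

Training on `(mixed_img, mixed_lbl)` then exposes the network to target-domain boundary pixels whose labels come from high-confidence pseudo-labels rather than from the source domain.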
