Paper Title

Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers

Paper Authors

Lixiang Ru, Yibing Zhan, Baosheng Yu, Bo Du

Paper Abstract

Weakly-supervised semantic segmentation (WSSS) with image-level labels is an important and challenging task. Due to the high training efficiency, end-to-end solutions for WSSS have received increasing attention from the community. However, current methods are mainly based on convolutional neural networks and fail to explore the global information properly, thus usually resulting in incomplete object regions. In this paper, to address the aforementioned problem, we introduce Transformers, which naturally integrate global information, to generate more integral initial pseudo labels for end-to-end WSSS. Motivated by the inherent consistency between the self-attention in Transformers and the semantic affinity, we propose an Affinity from Attention (AFA) module to learn semantic affinity from the multi-head self-attention (MHSA) in Transformers. The learned affinity is then leveraged to refine the initial pseudo labels for segmentation. In addition, to efficiently derive reliable affinity labels for supervising AFA and ensure the local consistency of pseudo labels, we devise a Pixel-Adaptive Refinement module that incorporates low-level image appearance information to refine the pseudo labels. We perform extensive experiments and our method achieves 66.0% and 38.9% mIoU on the PASCAL VOC 2012 and MS COCO 2014 datasets, respectively, significantly outperforming recent end-to-end methods and several multi-stage competitors. Code is available at https://github.com/rulixiang/afa.
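The core idea, deriving pairwise semantic affinity from multi-head self-attention and using it to propagate initial CAM pseudo labels, can be illustrated with a minimal sketch. The following is an assumption-laden illustration in PyTorch, not the authors' exact AFA implementation (see the linked repo for that): the tensor shapes, the head-averaging and symmetrization step, and the random-walk-style `refine_pseudo_labels` update are all illustrative choices.

```python
import torch

def affinity_from_attention(attn):
    """Derive a symmetric affinity matrix from multi-head self-attention.

    attn: (B, heads, N, N) self-attention maps from a Transformer block,
    where N is the number of patch tokens.
    Attention itself is not symmetric, while semantic affinity should be,
    so we average over heads and symmetrize (an illustrative choice).
    """
    affinity = attn.mean(dim=1)                           # (B, N, N)
    affinity = (affinity + affinity.transpose(1, 2)) / 2  # enforce symmetry
    return affinity

def refine_pseudo_labels(cam, affinity, n_iters=2):
    """Propagate initial CAM pseudo labels with the learned affinity.

    cam: (B, C, H, W) class activation maps used as initial pseudo labels,
    with N == H * W matching the affinity matrix.
    Each iteration mixes the scores of semantically similar patches
    (a random-walk-style update), helping complete partial object regions.
    """
    B, C, H, W = cam.shape
    # Row-normalize so the affinity acts as a transition matrix.
    trans = affinity / (affinity.sum(dim=-1, keepdim=True) + 1e-6)
    flat = cam.flatten(2)                                 # (B, C, N)
    for _ in range(n_iters):
        flat = torch.bmm(flat, trans.transpose(1, 2))     # (B, C, N)
    return flat.view(B, C, H, W)
```

In the paper, the affinity is additionally supervised by reliable labels derived from the refined pseudo labels (via the Pixel-Adaptive Refinement module), rather than taken from the raw attention as-is; the sketch above only shows the attention-to-affinity-to-refinement data flow.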
