Paper Title
Zoom-CAM: Generating Fine-grained Pixel Annotations from Image Labels
Paper Authors
Paper Abstract
Current weakly supervised object localization and segmentation rely on class-discriminative visualization techniques to generate pseudo-labels for pixel-level training. Such visualization methods, including class activation mapping (CAM) and Grad-CAM, use only the deepest, lowest-resolution convolutional layer, missing all information in intermediate layers. We propose Zoom-CAM: going beyond the last, lowest-resolution layer by integrating the importance maps over all activations in intermediate layers. Zoom-CAM captures fine-grained, small-scale objects for various discriminative class instances that are commonly missed by the baseline visualization methods. We focus on generating pixel-level pseudo-labels from class labels. The quality of our pseudo-labels, evaluated on the ImageNet localization task, shows an improvement of more than 2.8% in top-1 error. For weakly supervised semantic segmentation, our generated pseudo-labels improve a state-of-the-art model by 1.1%.
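The core idea of integrating importance maps from intermediate layers can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the per-map normalization and the element-wise max fusion used here are assumptions, and the real Zoom-CAM integration scheme may weight and combine layers differently.

```python
import numpy as np

def upsample_nearest(m, size):
    """Nearest-neighbor upsample of a 2D map to (size, size).

    Assumes `size` is an integer multiple of each input dimension,
    which holds for typical CNN feature-map resolutions.
    """
    h, w = m.shape
    return np.repeat(np.repeat(m, size // h, axis=0), size // w, axis=1)

def fuse_importance_maps(maps, out_size):
    """Fuse per-layer class-importance maps into one fine-grained map.

    Each map (e.g. a CAM-style heatmap from one conv layer, at its own
    resolution) is normalized to [0, 1], upsampled to the output
    resolution, and merged by an element-wise maximum so that detail
    from higher-resolution intermediate layers is retained.
    """
    fused = np.zeros((out_size, out_size))
    for m in maps:
        m = m - m.min()
        if m.max() > 0:
            m = m / m.max()
        fused = np.maximum(fused, upsample_nearest(m, out_size))
    return fused

# Toy maps: a coarse map from the deepest layer and a finer
# intermediate-layer map (both hypothetical, for illustration only).
deep_map = np.array([[0.0, 1.0],
                     [0.0, 0.0]])      # 2x2, low resolution
mid_map = np.random.rand(4, 4)         # 4x4, finer resolution
heatmap = fuse_importance_maps([deep_map, mid_map], out_size=8)
print(heatmap.shape)  # (8, 8)
```

In practice the fused heatmap would then be thresholded to produce pixel-level pseudo-labels for training a localization or segmentation model.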