Paper Title

Explicitly Modeled Attention Maps for Image Classification

Paper Authors

Andong Tan, Duc Tam Nguyen, Maximilian Dax, Matthias Nießner, Thomas Brox

Paper Abstract

Self-attention networks have shown remarkable progress in computer vision tasks such as image classification. The main benefit of the self-attention mechanism is the ability to capture long-range feature interactions in attention maps. However, the computation of attention maps requires a learnable key, query, and positional encoding, whose usage is often not intuitive and computationally expensive. To mitigate this problem, we propose a novel self-attention module with explicitly modeled attention maps that use only a single learnable parameter, yielding low computational overhead. The design of explicitly modeled attention maps with a geometric prior is based on the observation that the spatial context of a given pixel within an image is mostly dominated by its neighbors, while more distant pixels contribute less. Concretely, the attention maps are parametrized via simple functions (e.g., a Gaussian kernel) with a learnable radius, modeled independently of the input content. Our evaluation shows that our method achieves an accuracy improvement of up to 2.2% over the ResNet baselines on ImageNet ILSVRC and outperforms other self-attention methods such as AA-ResNet152 by 0.9% in accuracy with 6.4% fewer parameters and 6.7% fewer GFLOPs. This result empirically indicates the value of incorporating a geometric prior into the self-attention mechanism when applied to image classification.
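The core idea in the abstract is an attention map that depends only on pixel geometry and a single learnable radius, not on the input content. The sketch below illustrates that idea in NumPy: a Gaussian kernel over pairwise pixel distances is row-normalized and used to aggregate a value feature map. The function names, the row-normalization, and the way the map is applied to values are illustrative assumptions for this sketch, not the authors' actual implementation.

```python
import numpy as np

def gaussian_attention_map(height, width, radius):
    """Content-independent attention map: each spatial position attends to its
    neighbors with Gaussian weights controlled by a single radius parameter.
    Minimal sketch of the idea in the abstract; details may differ in the paper."""
    # Pixel coordinates as an (H*W, 2) array.
    coords = np.stack(
        np.meshgrid(np.arange(height), np.arange(width), indexing="ij"), axis=-1
    ).reshape(-1, 2)
    # Squared Euclidean distance between every pair of pixel locations: (H*W, H*W).
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    weights = np.exp(-d2 / (2.0 * radius ** 2))
    # Row-normalize so each query position's weights sum to one (softmax-like).
    return weights / weights.sum(axis=1, keepdims=True)

def apply_attention(values, radius):
    """values: (H, W, C) feature map; returns attended features of the same shape."""
    h, w, c = values.shape
    attn = gaussian_attention_map(h, w, radius)   # (H*W, H*W), no key/query needed
    out = attn @ values.reshape(h * w, c)         # aggregate neighboring features
    return out.reshape(h, w, c)

# Example: 8x8 feature map with 16 channels; `radius` plays the role of the
# single learnable parameter mentioned in the abstract (fixed here for the demo).
feat = np.random.randn(8, 8, 16).astype(np.float32)
out = apply_attention(feat, radius=2.0)
print(out.shape)  # (8, 8, 16)
```

Because such a map depends only on spatial positions, it can presumably be precomputed per feature-map size and shared across channels and images, which is consistent with the low computational overhead claimed in the abstract.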
