Global Context Aware RCNN for Object Detection

Paper Title

Global Context Aware RCNN for Object Detection

Authors

Wenchao Zhang, Chong Fu, Haoyu Xie, Mai Zhu, Ming Tie, Junxin Chen

Abstract

RoIPool/RoIAlign is an indispensable step in typical two-stage object detection algorithms: it rescales each object proposal cropped from the feature pyramid into a fixed-size feature map. However, because these cropped feature maps cover only local receptive fields, they lose much of the global context information. To tackle this problem, we propose a novel end-to-end trainable framework, called Global Context Aware (GCA) RCNN, which helps the neural network strengthen the spatial correlation between background and foreground by fusing global context information. The core component of our GCA framework is a context-aware mechanism in which a global feature pyramid is used for feature extraction and attention strategies are used for feature refinement. Specifically, we leverage dense connections to improve the information flow of the global context across the stages of the top-down process of FPN, and further use an attention mechanism to refine the global context at each level of the feature pyramid. Finally, we also present a lightweight version of our method, which only slightly increases model complexity and computational burden. Experimental results on the COCO benchmark dataset demonstrate the significant advantages of our approach.
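
The idea of fusing global context into a cropped RoI feature can be illustrated with a minimal sketch. The abstract does not specify the layers involved, so everything below is an assumption made for illustration: the function names (`global_average_pool`, `fuse_global_context`), the use of global average pooling as the context descriptor, the parameter-free sigmoid gate standing in for a learned attention module, and the additive fusion. It is not the authors' architecture.

```python
import math


def global_average_pool(feature_map):
    """Collapse a [C][H][W] feature map into one mean value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse_global_context(roi_feature, feature_map):
    """Add a gated, per-channel global context value to every RoI location.

    Assumption: the gate is sigmoid(context), a stand-in for a learned
    attention module; a real model would use trained weights here.
    """
    context = global_average_pool(feature_map)   # one value per channel
    gates = [sigmoid(c) for c in context]        # hypothetical attention weights
    return [
        [[value + gate * ctx for value in row] for row in channel]
        for channel, gate, ctx in zip(roi_feature, gates, context)
    ]


# Toy input: a 2-channel, 2x2 full-image feature map and a 1x1 RoI crop.
feature_map = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 3.0], [5.0, 7.0]],
]
roi_feature = [[[1.0]], [[2.0]]]

fused = fuse_global_context(roi_feature, feature_map)
print(fused[0][0][0])  # channel 0 has zero global context, so this stays 1.0
```

The RoI crop keeps its fixed spatial size; only its values change, which is why a fusion of this kind can sit after RoIPool/RoIAlign without altering the detection head's input shape.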
