Title
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
Authors
Abstract
Instance segmentation is one of the fundamental vision tasks. Recently, fully convolutional instance segmentation methods have drawn much attention as they are often simpler and more efficient than two-stage approaches like Mask R-CNN. To date, almost all such approaches fall behind the two-stage Mask R-CNN method in mask precision when the models have similar computational complexity, leaving great room for improvement. In this work, we achieve improved mask prediction by effectively combining instance-level information with lower-level, fine-grained semantic information. Our main contribution is a blender module which draws inspiration from both top-down and bottom-up instance segmentation approaches. The proposed BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels, and learns an attention map for each instance with merely one convolution layer, thus being fast in inference. BlendMask can be easily incorporated into state-of-the-art one-stage detection frameworks and outperforms Mask R-CNN under the same training schedule while being 20% faster. A lightweight version of BlendMask achieves 34.2% mAP at 25 FPS evaluated on a single 1080Ti GPU card. Because of its simplicity and efficacy, we hope that BlendMask can serve as a simple yet strong baseline for a wide range of instance-wise prediction tasks. Code is available at https://git.io/AdelaiDet
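The core idea described above can be sketched in a few lines: each instance's attention maps weight a small set of shared position-sensitive base maps, and the weighted bases are summed into that instance's mask logits. The function and shapes below are illustrative assumptions for a minimal NumPy sketch, not the paper's actual implementation (which uses learned convolutional features and resizes the attention maps to the base resolution).

```python
import numpy as np

def blend(bases, attentions):
    """Blend K position-sensitive bases into one instance mask.

    bases:      (K, H, W) array of base maps shared by all instances
    attentions: (K, H, W) array, one attention map per base for this
                instance (assumed already resized to the base resolution)
    Returns a (H, W) map of mask logits for the instance.
    """
    # Weight each base element-wise by its attention map,
    # then sum over the K bases.
    return (bases * attentions).sum(axis=0)

# Toy example with K = 4 bases on an 8x8 grid.
rng = np.random.default_rng(0)
bases = rng.standard_normal((4, 8, 8))
atts = rng.random((4, 8, 8))
mask_logits = blend(bases, atts)
print(mask_logits.shape)  # one (8, 8) logit map per instance
```

Because the bases are shared across all instances and each instance only contributes its small set of attention maps, the per-instance cost stays low, which is where the speed advantage over per-RoI mask heads comes from.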