Paper Title
Interactive Video Object Segmentation Using Global and Local Transfer Modules
Paper Authors
Paper Abstract
An interactive video object segmentation algorithm, which takes scribble annotations on query objects as input, is proposed in this paper. We develop a deep neural network, which consists of the annotation network (A-Net) and the transfer network (T-Net). First, given user scribbles on a frame, A-Net yields a segmentation result based on the encoder-decoder architecture. Second, T-Net transfers the segmentation result bidirectionally to the other frames, by employing the global and local transfer modules. The global transfer module conveys the segmentation information in an annotated frame to a target frame, while the local transfer module propagates the segmentation information in a temporally adjacent frame to the target frame. By applying A-Net and T-Net alternately, a user can obtain the desired segmentation results with minimal effort. We train the entire network in two stages, by emulating user scribbles and employing an auxiliary loss. Experimental results demonstrate that the proposed interactive video object segmentation algorithm outperforms state-of-the-art conventional algorithms. Code and models are available at https://github.com/yuk6heo/IVOS-ATNet.
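As a rough illustration of the alternating A-Net/T-Net scheme described in the abstract, the sketch below shows one interaction round: A-Net segments the annotated frame from the user scribble, and T-Net then propagates that mask bidirectionally, using the annotated frame's mask as the global cue and the previously processed neighboring frame's mask as the local cue. The class and function names (ANet, TNet, interactive_round) and the toy layer choices are hypothetical placeholders, not the authors' architecture; the official implementation is at the repository linked above.

```python
# Minimal sketch (not the authors' code) of one interactive segmentation round.
import torch
import torch.nn as nn


class ANet(nn.Module):
    """Placeholder encoder-decoder: frame + scribble map -> object mask."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(4, 16, 3, padding=1)   # 3 RGB channels + 1 scribble channel
        self.dec = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, frame, scribble):
        x = torch.cat([frame, scribble], dim=1)
        return torch.sigmoid(self.dec(torch.relu(self.enc(x))))


class TNet(nn.Module):
    """Placeholder transfer net: fuses a global cue (annotated-frame mask)
    and a local cue (adjacent-frame mask) to segment the target frame."""
    def __init__(self):
        super().__init__()
        self.fuse = nn.Conv2d(3 + 1 + 1, 1, 3, padding=1)

    def forward(self, frame, global_mask, local_mask):
        x = torch.cat([frame, global_mask, local_mask], dim=1)
        return torch.sigmoid(self.fuse(x))


def interactive_round(frames, scribble, annotated_idx, a_net, t_net):
    """One round: segment the annotated frame with A-Net, then propagate the
    mask bidirectionally to all other frames with T-Net."""
    num_frames = len(frames)
    masks = [None] * num_frames
    masks[annotated_idx] = a_net(frames[annotated_idx], scribble)
    # Forward propagation: annotated frame -> last frame.
    for t in range(annotated_idx + 1, num_frames):
        masks[t] = t_net(frames[t], masks[annotated_idx], masks[t - 1])
    # Backward propagation: annotated frame -> first frame.
    for t in range(annotated_idx - 1, -1, -1):
        masks[t] = t_net(frames[t], masks[annotated_idx], masks[t + 1])
    return masks


if __name__ == "__main__":
    frames = [torch.rand(1, 3, 64, 64) for _ in range(5)]
    scribble = torch.zeros(1, 1, 64, 64)
    scribble[..., 30:34, 30:34] = 1.0  # emulate a user scribble on frame 2
    masks = interactive_round(frames, scribble, 2, ANet(), TNet())
    print([m.shape for m in masks])
```

In the actual interactive setting, the user would inspect the propagated masks, scribble on the worst frame, and the round would repeat with that frame as the new annotated frame.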