Paper Title


AssembleNet++: Assembling Modality Representations via Attention Connections

Authors

Ryoo, Michael S., Piergiovanni, AJ, Kangaspunta, Juhana, Angelova, Anelia

Abstract


We create a family of powerful video models which are able to: (i) learn interactions between semantic object information and raw appearance and motion features, and (ii) deploy attention in order to better learn the importance of features at each convolutional block of the network. A new network component named peer-attention is introduced, which dynamically learns the attention weights using another block or input modality. Even without pre-training, our models outperform previous work on standard public activity recognition datasets with continuous videos, establishing a new state of the art. We also confirm that our findings of having neural connections from the object modality and the use of peer-attention are generally applicable to different existing architectures, improving their performance. We name our model AssembleNet++. The code will be available at: https://sites.google.com/corp/view/assemblenet/
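To make the peer-attention idea concrete, below is a minimal sketch of channel-wise attention whose gating weights are computed from a *different* block's features, rather than from the block being re-weighted. All names, shapes, and the linear-plus-sigmoid gating form are illustrative assumptions for this sketch; the exact formulation is given in the paper itself.

```python
import numpy as np

def peer_attention(x, peer, w, b):
    """Sketch of peer-attention: re-weight the channels of `x` using
    gates derived from a peer block (or input modality) `peer`.

    x:    (C, T, H, W)  features of the current block
    peer: (Cp, T, H, W) features of the attention source
    w, b: (C, Cp) and (C,) parameters of a hypothetical linear layer
    """
    # Global average pooling over space-time collapses peer to (Cp,)
    pooled = peer.mean(axis=(1, 2, 3))
    # Sigmoid-gated channel weights computed from the peer features
    gates = 1.0 / (1.0 + np.exp(-(w @ pooled + b)))  # shape (C,)
    # Broadcast the per-channel gates over the space-time dimensions
    return x * gates[:, None, None, None]

# Usage with random features (shapes are illustrative only)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 7, 7))      # e.g. appearance block
peer = rng.standard_normal((16, 4, 7, 7))  # e.g. object-modality block
w = rng.standard_normal((8, 16)) * 0.1
b = np.zeros(8)
y = peer_attention(x, peer, w, b)
print(y.shape)  # same shape as x
```

Because the gates come from the peer block, the network can let one modality (e.g. semantic object features) modulate which channels of another modality (e.g. raw appearance or motion) are emphasized, which is the interaction the abstract describes.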
