Deep hierarchical pooling design for cross-granularity action recognition

Paper title

Deep hierarchical pooling design for cross-granularity action recognition

Authors

Ahmed Mazari, Hichem Sahbi

Abstract

In this paper, we introduce a novel hierarchical aggregation design that captures different levels of temporal granularity in action recognition. Our design principle is coarse-to-fine and is implemented using a tree-structured network: as we traverse this network top-down, pooling operations become less invariant but temporally more resolute and better localized. The combination of operations in this network that best fits a given ground truth is learned by solving a constrained minimization problem, whose solution corresponds to the distribution of weights capturing the contribution of each level (and thereby each temporal granularity) to the global hierarchical pooling process. Besides being principled and well grounded, the proposed hierarchical pooling is also video-length agnostic and resilient to misalignments in actions. Extensive experiments conducted on the challenging UCF-101 database corroborate these statements.
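The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a binary tree, average pooling at every node, and level weights constrained to the probability simplex (non-negative, summing to 1). The abstract states none of these choices, and the function names are hypothetical. The weights are fixed by hand here, whereas the paper learns them by solving a constrained minimization problem.

```python
def mean_pool(frames):
    """Average a non-empty list of equal-length feature vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def split_segments(frames, k):
    """Split frames into k contiguous, roughly equal segments (never empty)."""
    n = len(frames)
    segments = []
    for i in range(k):
        start = i * n // k
        # Guarantee at least one frame per segment, even when n < k.
        end = max(start + 1, (i + 1) * n // k)
        segments.append(frames[start:end])
    return segments


def hierarchical_pool(frames, level_weights):
    """Concatenate weighted average-pooled features from every tree node.

    Level l of the (assumed binary) tree has 2**l nodes, each covering one
    contiguous temporal segment. level_weights is assumed to lie on the
    probability simplex: non-negative and summing to 1.
    """
    if not frames:
        raise ValueError("need at least one frame")
    if any(w < 0 for w in level_weights) or abs(sum(level_weights) - 1.0) > 1e-9:
        raise ValueError("level_weights must be non-negative and sum to 1")
    pooled = []
    for level, weight in enumerate(level_weights):
        for segment in split_segments(frames, 2 ** level):
            pooled.extend(weight * x for x in mean_pool(segment))
    return pooled


# Two toy "videos" of different lengths with 2-D frame features.
short_video = [[1.0, 0.0], [3.0, 0.0]]
long_video = [[1.0, 0.0], [1.0, 0.0], [3.0, 0.0], [3.0, 0.0], [3.0, 0.0]]
weights = [0.5, 0.5]  # hand-picked: root level plus one level of two children

short_repr = hierarchical_pool(short_video, weights)
long_repr = hierarchical_pool(long_video, weights)
```

Both toy videos yield a representation of the same length, because the output size depends only on the number of tree levels and the feature dimension. In this sketch, that is what makes the pooling video-length agnostic.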
