Paper Title
Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos
Authors
Abstract
Human-Object Interaction (HOI) recognition in videos is important for analyzing human activity. Most existing work focuses on visual features and therefore usually suffers from occlusion in real-world scenarios. The problem is further complicated when multiple people and objects are involved in HOIs. Considering that geometric features such as human pose and object position provide meaningful information for understanding HOIs, we argue for combining the benefits of both visual and geometric features in HOI recognition, and propose a novel Two-level Geometric feature-informed Graph Convolutional Network (2G-GCN). The geometric-level graph models the interdependency between the geometric features of humans and objects, while the fusion-level graph further fuses them with the visual features of humans and objects. To demonstrate the novelty and effectiveness of our method in challenging scenarios, we propose a new multi-person HOI dataset (MPHOI-72). Extensive experiments on the MPHOI-72 (multi-person HOI), CAD-120 (single-human HOI) and Bimanual Actions (two-hand HOI) datasets demonstrate that our method outperforms state-of-the-art methods.
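The two-level idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' 2G-GCN implementation: the fully connected graph, the row normalization, the concatenation-based fusion, and every name here (`graph_layer`, `two_level_forward`, `w_geo`, `w_fuse`) are assumptions made purely for illustration. It shows only the data flow, in which geometric features pass through a first graph layer and the result is fused with visual features in a second one.

```python
# Hypothetical, simplified sketch of a two-level graph network.
# All names and design choices are illustrative assumptions, not the paper's.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def normalize_rows(adj):
    """Row-normalize an adjacency matrix so each node averages its neighbours."""
    return [[v / sum(row) for v in row] for row in adj]

def graph_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(normalized_adj @ feats @ weight)."""
    out = matmul(matmul(normalize_rows(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

def two_level_forward(geo_feats, vis_feats, w_geo, w_fuse):
    """Run the geometric-level layer, then fuse with visual features."""
    n = len(geo_feats)
    # Assumption: every entity (human or object) is connected to every other,
    # including itself.
    adj = [[1.0] * n for _ in range(n)]
    # Level 1: model interdependency between geometric features.
    geo_hidden = graph_layer(adj, geo_feats, w_geo)
    # Level 2: concatenate each node's geometric embedding with its visual
    # feature vector, then apply a second graph layer.
    fused = [g + v for g, v in zip(geo_hidden, vis_feats)]
    return graph_layer(adj, fused, w_fuse)

# Toy input: two humans and one object, with 2-D geometric and visual features.
geo = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
vis = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_geo = [[1.0, 0.0], [0.0, 1.0]]
w_fuse = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
out = two_level_forward(geo, vis, w_geo, w_fuse)
```

In this toy setup `out` holds one fused embedding per entity. A real model would use learned weights, per-frame temporal modelling, and a classifier on top, none of which is shown here.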