Paper Title
FusePose: IMU-Vision Sensor Fusion in Kinematic Space for Parametric Human Pose Estimation
Paper Authors
Paper Abstract
There exist challenging problems in the 3D human pose estimation task, such as poor performance caused by occlusion and self-occlusion. Recently, IMU-vision sensor fusion has been regarded as valuable for solving these problems. However, previous research on the fusion of IMU and vision data, which are heterogeneous, fails to adequately utilize either the IMU raw data or reliable high-level vision features. To facilitate more efficient sensor fusion, in this work we propose a framework called \emph{FusePose} under a parametric human kinematic model. Specifically, we aggregate different information from IMU and vision data and introduce three distinctive sensor fusion approaches: NaiveFuse, KineFuse and AdaDeepFuse. NaiveFuse serves as a basic approach that only fuses simplified IMU data with the estimated 3D pose in Euclidean space. In kinematic space, KineFuse is able to integrate the calibrated and aligned IMU raw data with converted 3D pose parameters. AdaDeepFuse further develops this kinematic fusion process into an adaptive, end-to-end trainable manner. Comprehensive experiments with ablation studies demonstrate the rationality and superiority of the proposed framework. The performance of 3D human pose estimation is improved compared to the baseline results. On the Total Capture dataset, KineFuse surpasses the previous state of the art that uses IMUs only for testing by 8.6\%. AdaDeepFuse surpasses the state of the art that uses IMUs for both training and testing by 8.5\%. Moreover, we validate the generalization capability of our framework through experiments on the Human3.6M dataset.
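To make the Euclidean-space fusion idea described in the abstract concrete, below is a minimal, hypothetical Python/NumPy sketch (not the authors' implementation): IMU orientations are turned into bone direction vectors and blended with bone directions derived from a vision-estimated 3D pose, after which joint positions are re-assembled along the kinematic chain. The skeleton definition (BONES), the canonical bone axis, the blending weight and all function names are illustrative assumptions.

import numpy as np

# Hypothetical skeleton: (parent_joint, child_joint) index pairs.
BONES = [(0, 1), (1, 2), (2, 3)]

def bone_directions(joints_3d):
    """Unit direction vectors of each bone from a (J, 3) joint array."""
    dirs = []
    for parent, child in BONES:
        v = joints_3d[child] - joints_3d[parent]
        dirs.append(v / (np.linalg.norm(v) + 1e-8))
    return np.stack(dirs)

def naive_fuse(joints_3d, imu_rotations, weight=0.5):
    """Blend vision-based bone directions with IMU-derived ones, then
    re-chain the joints while keeping the original bone lengths.

    joints_3d:     (J, 3) vision-estimated joint positions.
    imu_rotations: (B, 3, 3) per-bone rotation matrices, assumed to be
                   calibrated into the same global frame as the pose.
    weight:        blending factor for the IMU term (illustrative choice).
    """
    vis_dirs = bone_directions(joints_3d)
    # Assume each IMU rotation maps a canonical bone axis (+y here) to the
    # bone's global direction.
    canonical_axis = np.array([0.0, 1.0, 0.0])
    imu_dirs = imu_rotations @ canonical_axis
    fused_dirs = (1 - weight) * vis_dirs + weight * imu_dirs
    fused_dirs /= np.linalg.norm(fused_dirs, axis=1, keepdims=True) + 1e-8

    fused = joints_3d.copy()
    for i, (parent, child) in enumerate(BONES):
        length = np.linalg.norm(joints_3d[child] - joints_3d[parent])
        fused[child] = fused[parent] + length * fused_dirs[i]
    return fused

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pose = rng.normal(size=(4, 3))             # 4 joints, mock vision estimate
    rots = np.stack([np.eye(3)] * len(BONES))  # identity IMU orientations
    print(naive_fuse(pose, rots))

This only illustrates the simple Euclidean-space blending attributed to NaiveFuse; the paper's KineFuse and AdaDeepFuse instead operate on calibrated IMU raw data and pose parameters in kinematic space, which a position-space sketch like this does not capture.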