Paper Title
End-to-End Human Pose and Mesh Reconstruction with Transformers
Paper Authors
Abstract
We present a new method, called MEsh TRansfOrmer (METRO), to reconstruct 3D human pose and mesh vertices from a single image. Our method uses a transformer encoder to jointly model vertex-vertex and vertex-joint interactions, and outputs 3D joint coordinates and mesh vertices simultaneously. Compared to existing techniques that regress pose and shape parameters, METRO does not rely on any parametric mesh model such as SMPL, so it can be easily extended to other objects such as hands. We further relax the mesh topology and allow the transformer self-attention mechanism to attend freely between any two vertices, making it possible to learn non-local relationships among mesh vertices and joints. With the proposed masked vertex modeling, our method is more robust and effective in handling challenging situations such as partial occlusions. METRO achieves new state-of-the-art results for human mesh reconstruction on the public Human3.6M and 3DPW datasets. Moreover, we demonstrate the generalizability of METRO to 3D hand reconstruction in the wild, outperforming existing state-of-the-art methods on the FreiHAND dataset. Code and pre-trained models are available at https://github.com/microsoft/MeshTransformer.
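The two mechanisms the abstract names, unrestricted self-attention over joint and vertex tokens and masked vertex modeling, can be illustrated with a minimal NumPy sketch. This is not the released METRO code: the token counts, feature size, zero-vector mask token, single attention head, and random linear output head are all assumptions chosen for illustration, and the helper names `mask_queries` and `self_attention` are hypothetical.

```python
import numpy as np


def mask_queries(queries, mask_ratio, rng):
    """Zero out a random subset of tokens (hypothetical stand-in for a mask token)."""
    n = queries.shape[0]
    n_masked = int(round(mask_ratio * n))
    masked_idx = rng.choice(n, size=n_masked, replace=False)
    out = queries.copy()
    out[masked_idx] = 0.0
    return out, masked_idx


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention with no topology mask."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)

# Illustrative sizes only; they are not taken from the paper.
num_joints, num_vertices, dim = 4, 12, 8

# One token per joint and per vertex, stacked into a single sequence.
tokens = rng.normal(size=(num_joints + num_vertices, dim))

# Masked vertex modeling (assumed form): hide some tokens so the model
# must infer them from the remaining ones.
masked_tokens, masked_idx = mask_queries(tokens, mask_ratio=0.25, rng=rng)

# Every token may attend to every other token: joint-vertex and
# vertex-vertex pairs share one attention matrix.
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))
attended, weights = self_attention(masked_tokens, w_q, w_k, w_v)

# Hypothetical linear head mapping each token to an (x, y, z) coordinate.
head = rng.normal(size=(dim, 3))
coords = attended @ head
joints_3d, vertices_3d = coords[:num_joints], coords[num_joints:]

print(weights.shape, joints_3d.shape, vertices_3d.shape)
```

Because no adjacency mask is applied to `scores`, the attention matrix covers all token pairs, which corresponds to the relaxed mesh topology described above. The same sequence yields both outputs at once: the first `num_joints` rows are read as joints and the rest as vertices.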