Paper Title
MIME: Human-Aware 3D Scene Generation
Paper Authors
Paper Abstract
Generating realistic 3D worlds occupied by moving humans has many applications in games, architecture, and synthetic data creation. But generating such scenes is expensive and labor intensive. Recent work generates human poses and motions given a 3D scene. Here, we take the opposite approach and generate 3D indoor scenes given 3D human motion. Such motions can come from archival motion capture or from IMU sensors worn on the body, effectively turning human movement into a "scanner" of the 3D world. Intuitively, human movement indicates the free space in a room, and human contact indicates surfaces or objects that support activities such as sitting, lying, or touching. We propose MIME (Mining Interaction and Movement to infer 3D Environments), a generative model of indoor scenes that produces furniture layouts consistent with the human movement. MIME uses an auto-regressive transformer architecture that takes the already generated objects in the scene as well as the human motion as input, and outputs the next plausible object. To train MIME, we build a dataset by populating the 3D-FRONT scene dataset with 3D humans. Our experiments show that MIME produces more diverse and plausible 3D scenes than a recent generative scene method that does not know about human movement. Code and data will be available for research at https://mime.is.tue.mpg.de.
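The auto-regressive generation the abstract describes (condition on the objects placed so far plus the human motion, emit the next object, repeat until a stop token) can be sketched as a toy loop. This is a minimal illustrative sketch, not the authors' implementation: `SceneObject`, `predict_next_object`, and the motion-feature encoding are all hypothetical stand-ins for the real transformer and its inputs.

```python
# Hypothetical sketch of an autoregressive scene-generation loop in the
# spirit of MIME; all names, shapes, and the predictor are illustrative
# assumptions, not the paper's actual model or API.
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    category: str   # e.g. "sofa", "bed"
    position: tuple # (x, y, z) in room coordinates

@dataclass
class Scene:
    motion_features: list                       # encoded free-space / contact cues
    objects: list = field(default_factory=list) # objects generated so far

def predict_next_object(scene):
    """Stand-in for the transformer: in the real model, the next object's
    category and placement would be conditioned on scene.motion_features
    and scene.objects. Here we just pick an unused category."""
    for cat in ["bed", "sofa", "table", "chair"]:
        if all(o.category != cat for o in scene.objects):
            return SceneObject(cat, (len(scene.objects) * 1.0, 0.0, 0.0))
    return None  # plays the role of an end-of-scene token

def generate_scene(motion_features, max_objects=4):
    scene = Scene(motion_features)
    while len(scene.objects) < max_objects:
        nxt = predict_next_object(scene)
        if nxt is None:            # model emitted the stop token
            break
        scene.objects.append(nxt)  # feed the new object back in autoregressively
    return scene

scene = generate_scene(motion_features=["sit_contact", "free_space"])
print([o.category for o in scene.objects])  # ['bed', 'sofa', 'table', 'chair']
```

The key design point mirrored here is that each prediction step sees the full partial scene plus the motion conditioning, so later objects can avoid occupied space and attach to contact regions.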