Title

SMILE: Semantically-guided Multi-attribute Image and Layout Editing

Authors

Andrés Romero, Luc Van Gool, Radu Timofte

Abstract

Attribute image manipulation has been a very active topic since the introduction of Generative Adversarial Networks (GANs). Exploring the disentangled attribute space within a transformation is a very challenging task due to the multiple and mutually-inclusive nature of the facial images, where different labels (eyeglasses, hats, hair, identity, etc.) can co-exist at the same time. Several works address this issue either by exploiting the modality of each domain/attribute using a conditional random vector noise, or extracting the modality from an exemplary image. However, existing methods cannot handle both random and reference transformations for multiple attributes, which limits the generality of the solutions. In this paper, we successfully exploit a multimodal representation that handles all attributes, be it guided by random noise or exemplar images, while only using the underlying domain information of the target domain. We present extensive qualitative and quantitative results for facial datasets and several different attributes that show the superiority of our method. Additionally, our method is capable of adding, removing or changing either fine-grained or coarse attributes by using an image as a reference or by exploring the style distribution space, and it can be easily extended to head-swapping and face-reenactment applications without being trained on videos.
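
To make the dual guidance concrete: the abstract describes one style interface fed by two sources, either random noise mapped to a style code or a style code extracted from an exemplar image, both conditioned only on the target domain. Below is a minimal, hypothetical sketch of that idea; all module names, dimensions, and the domain set are assumptions for illustration, not the authors' implementation.

```python
# Sketch (assumed, not the paper's code) of the dual style pathway:
# a style code for a target domain comes either from random noise
# (mapping network) or from an exemplar image (style encoder).
import torch
import torch.nn as nn

NUM_DOMAINS = 4   # e.g. eyeglasses, hat, hair, identity (assumed)
STYLE_DIM = 64    # assumed style-code size
LATENT_DIM = 16   # assumed noise size

class MappingNetwork(nn.Module):
    """Random guidance: noise -> per-domain style code."""
    def __init__(self):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(LATENT_DIM, 128), nn.ReLU())
        # One head per domain keeps the style code domain-specific.
        self.heads = nn.ModuleList(
            [nn.Linear(128, STYLE_DIM) for _ in range(NUM_DOMAINS)])

    def forward(self, z, domain):
        return self.heads[domain](self.shared(z))

class StyleEncoder(nn.Module):
    """Reference guidance: exemplar image -> per-domain style code."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.heads = nn.ModuleList(
            [nn.Linear(32, STYLE_DIM) for _ in range(NUM_DOMAINS)])

    def forward(self, img, domain):
        return self.heads[domain](self.backbone(img))

# Both pathways yield the same style interface, so random sampling and
# reference-based editing can drive one generator interchangeably.
mapper, encoder = MappingNetwork(), StyleEncoder()
domain = 2                                    # hypothetical "hair" domain
s_random = mapper(torch.randn(1, LATENT_DIM), domain)
s_reference = encoder(torch.randn(1, 3, 64, 64), domain)  # dummy exemplar
print(s_random.shape, s_reference.shape)      # both torch.Size([1, 64])
```

Because either code can be fed to the generator, "exploring the style distribution space" amounts to resampling the noise input, while reference editing swaps in the encoder's output for the same domain.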
