Paper Title

PetsGAN: Rethinking Priors for Single Image Generation

Paper Authors

Zicheng Zhang, Yinglu Liu, Congying Han, Hailin Shi, Tiande Guo, Bowen Zhou

Paper Abstract

Single image generation (SIG), the task of generating diverse samples that share similar visual content with a given single image, was first introduced by SinGAN, which builds a pyramid of GANs to progressively learn the internal patch distribution of the single image. SinGAN also shows great potential in a wide range of image manipulation tasks. However, its paradigm has limitations in terms of generation quality and training time. First, due to the lack of high-level information, SinGAN cannot handle object images as well as it does scene and texture images. Second, the separate progressive training scheme is time-consuming and prone to artifact accumulation. To tackle these problems, in this paper we dig into the SIG problem and improve on SinGAN by fully utilizing internal and external priors. The main contributions of this paper are: 1) We introduce a regularized latent variable model for SIG. To the best of our knowledge, this is the first clear formulation and optimization goal for SIG, and all existing SIG methods can be regarded as special cases of this model. 2) We design a novel Prior-based end-to-end training GAN (PetsGAN) to overcome the problems of SinGAN. Our method dispenses with the time-consuming progressive training scheme and can be trained end-to-end. 3) We conduct extensive qualitative and quantitative experiments that show the superiority of our method in generated image quality, diversity, and training speed. Moreover, we apply our method to other image manipulation tasks (e.g., style transfer, harmonization), and the results further demonstrate the effectiveness and efficiency of our method.
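The abstract does not spell out the regularized latent variable model itself. Purely as an illustrative sketch (every symbol below is an assumption for exposition, not taken from the paper), such an objective for SIG might combine a patch-level distribution-matching term with a prior-induced regularizer:

\min_{G}\; \mathcal{D}_{\mathrm{patch}}\big(p_{x},\, p_{G}\big) \;+\; \lambda\, \mathcal{R}(G)

where p_x is the internal patch distribution of the single training image x, p_G is the patch distribution induced by samples G(z) with z drawn from a latent prior p(z), \mathcal{D}_{\mathrm{patch}} is a divergence between patch distributions (in practice typically approximated adversarially by a patch discriminator), and \mathcal{R} is a regularizer encoding internal and/or external priors with weight \lambda. On this hypothetical reading, pyramid-based methods such as SinGAN would fall out as special cases with particular multi-scale choices of \mathcal{D}_{\mathrm{patch}} and no explicit external prior.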
