Paper Title
BalaGAN: Image Translation Between Imbalanced Domains via Cross-Modal Transfer
Paper Authors
Paper Abstract
State-of-the-art image-to-image translation methods tend to struggle in an imbalanced domain setting, where one image domain lacks richness and diversity. We introduce a new unsupervised translation network, BalaGAN, specifically designed to tackle the domain imbalance problem. We leverage the latent modalities of the richer domain to turn the image-to-image translation problem between two imbalanced domains into a balanced, multi-class, conditional translation problem, more closely resembling the style transfer setting. Specifically, we analyze the source domain and learn a decomposition of it into a set of latent modes or classes, without any supervision. This leaves us with a multitude of balanced cross-domain translation tasks between all pairs of classes, including the target domain. During inference, the trained network takes as input a source image, together with a reference or style image from one of the modes as a condition, and produces an image that resembles the source at the pixel level but shares the mode of the reference. We show that employing modalities within the dataset improves the quality of the translated images, and that BalaGAN outperforms strong baselines of both unconditioned and style-transfer-based image-to-image translation methods in terms of image quality and diversity.
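The core idea of the abstract — decomposing the richer source domain into latent modes without supervision, then treating every ordered pair of classes (modes plus the target domain) as a balanced translation task — can be pictured with a simple sketch. This is not the authors' implementation: the feature embeddings, the k-means clustering step, and the names (`kmeans`, `n_modes`, `tasks`) are all illustrative assumptions standing in for whatever mode-discovery mechanism the paper actually uses.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Toy k-means: stand-in for unsupervised mode discovery on image features."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centers from the current assignment.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

rng = np.random.default_rng(0)
# Stand-in for feature embeddings of source-domain images (e.g. from an encoder).
source_features = rng.normal(size=(200, 64))

n_modes = 4  # hypothetical number of latent modes
mode_labels = kmeans(source_features, n_modes)

# Treat the target domain as one extra class alongside the discovered modes,
# and enumerate every ordered pair of distinct classes as a balanced
# cross-domain translation task.
classes = [f"mode_{m}" for m in range(n_modes)] + ["target_domain"]
tasks = [(src, dst) for src in classes for dst in classes if src != dst]
print(len(tasks))  # n * (n - 1) ordered pairs for n = n_modes + 1 classes
```

With 4 discovered modes plus the target domain, the single imbalanced two-domain problem becomes 20 balanced class-to-class translation tasks, which is the reframing the abstract describes.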