Paper Title

TreeGAN: Incorporating Class Hierarchy into Image Generation

Authors

Ruisi Zhang, Luntian Mou, Pengtao Xie

Abstract

Conditional image generation (CIG) is a widely studied problem in computer vision and machine learning. Given a class, CIG takes the name of this class as input and generates a set of images that belong to this class. Existing CIG methods generate the images of different classes independently, without considering the relationships among classes. In real-world applications, the classes are organized into a hierarchy, and their hierarchical relationships are informative for generating high-fidelity images. In this paper, we aim to leverage the class hierarchy for conditional image generation. We propose two ways of incorporating the class hierarchy: prior control and post constraint. In prior control, we first encode the class hierarchy, then feed it as a prior into the conditional generator to generate images. In post constraint, after the images are generated, we measure their consistency with the class hierarchy and use the consistency score to guide the training of the generator. Based on these two ideas, we propose the TreeGAN model, which consists of three modules: (1) a class hierarchy encoder (CHE), which takes the hierarchical structure of the classes and their textual names as inputs and learns an embedding for each class that captures the hierarchical relationships among classes; (2) a conditional image generator (CIG), which takes the CHE-generated embedding of a class as input and generates a set of images belonging to this class; (3) a consistency checker, which performs hierarchical classification on the generated images and checks whether they are compatible with the class hierarchy; the consistency score is used to guide the CIG to generate hierarchy-compatible images. Experiments on various datasets demonstrate the effectiveness of our method.
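
The post-constraint idea can be illustrated with a small sketch. The abstract does not say how the consistency score is computed, so everything below is an assumption made for illustration: the toy hierarchy `PARENT`, the helper names `path_from_root` and `consistency_score`, and the scoring rule (the fraction of hierarchy levels at which a hierarchical classifier's prediction agrees with the target class's root-to-leaf path). It is not the paper's implementation.

```python
# Hypothetical toy class hierarchy, stored as child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_from_root(cls, parent=PARENT):
    """Return the list of classes from the root down to `cls`."""
    path = []
    node = cls
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def consistency_score(target_cls, predicted_path, parent=PARENT):
    """Fraction of hierarchy levels where the prediction matches the target path.

    `predicted_path` stands in for the per-level labels that a hierarchical
    classifier would output for one generated image (an assumption; the
    abstract does not specify the classifier's output format).
    """
    target_path = path_from_root(target_cls, parent)
    matches = sum(1 for t, p in zip(target_path, predicted_path) if t == p)
    return matches / len(target_path)


if __name__ == "__main__":
    # An image generated for "dog" but classified as a cat still agrees
    # with the hierarchy at the "entity" and "animal" levels.
    print(consistency_score("dog", ["entity", "animal", "cat"]))
```

Under this assumed rule, an image generated for "dog" but classified as "cat" is penalized less than one classified as "car", because the first prediction still agrees with the hierarchy at the "animal" level. In a trained model, a differentiable analogue of such a score would be needed to guide the generator; the discrete version above only illustrates the notion of hierarchy compatibility.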
