Paper Title
VAE-Info-cGAN: Generating Synthetic Images by Combining Pixel-level and Feature-level Geospatial Conditional Inputs
Paper Authors
Paper Abstract
Training robust supervised deep learning models for many geospatial applications of computer vision is difficult due to a dearth of class-balanced and diverse training data. Moreover, obtaining enough training data for many applications is financially prohibitive or may be infeasible, especially when the application involves modeling rare or extreme events. Synthetically generating data (and labels) using a generative model that can sample from a target distribution and exploit the multi-scale nature of images can be an inexpensive solution to address the scarcity of labeled data. Towards this goal, we present a deep conditional generative model, called VAE-Info-cGAN, that combines a Variational Autoencoder (VAE) with a conditional Information Maximizing Generative Adversarial Network (InfoGAN), for synthesizing semantically rich images simultaneously conditioned on a pixel-level condition (PLC) and a macroscopic feature-level condition (FLC). Dimensionally, the PLC may differ from the synthesized image only in the channel dimension and is meant to be a task-specific input. The FLC is modeled as an attribute vector in the latent space of the generated image which controls the contributions of various characteristic attributes germane to the target distribution. An interpretation of the attribute vector to systematically generate synthetic images by varying a chosen binary macroscopic feature is explored. Experiments on a GPS trajectories dataset show that the proposed model can accurately generate various forms of spatio-temporal aggregates across different geographic locations while conditioned only on a raster representation of the road network. The primary intended application of the VAE-Info-cGAN is synthetic data (and label) generation for targeted data augmentation in computer vision-based modeling of problems relevant to geospatial analysis and remote sensing.
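To make the two conditioning pathways concrete, the sketch below shows one plausible way the generator's inputs could be assembled: an InfoGAN-style latent code formed by concatenating noise `z` with the binary FLC attribute vector `c`, and a PLC raster (e.g., a road-network image) that shares the output's spatial size but may have a different number of channels. This is a NumPy-only illustration under assumed names and dimensions (`z_dim`, `flc_dim`, `plc_channels`, the `fake_generator` stand-in), not the paper's actual architecture.

```python
import numpy as np


def build_generator_inputs(z_dim=100, flc_dim=10, plc_channels=1, H=64, W=64, seed=0):
    """Assemble hypothetical VAE-Info-cGAN conditioning inputs (illustrative only)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(z_dim)                  # VAE-style latent noise
    c = (rng.random(flc_dim) > 0.5).astype(float)   # FLC: binary macroscopic attribute vector
    latent = np.concatenate([z, c])                 # InfoGAN-style latent code [z; c]
    # PLC: raster conditioning input (e.g., road network) with the same
    # spatial size (H, W) as the synthesized image, but its channel count
    # is free to differ from the output's.
    plc = rng.random((plc_channels, H, W))
    return latent, plc


def fake_generator(latent, plc, out_channels=3):
    """Stand-in for the generator network: broadcast the latent over space,
    concatenate the PLC channel-wise, then 'decode' by a trivial projection.
    A real model would apply learned convolutional layers here."""
    H, W = plc.shape[1:]
    latent_maps = np.tile(latent[:, None, None], (1, H, W))  # (z_dim+flc_dim, H, W)
    g_input = np.concatenate([latent_maps, plc], axis=0)     # channel-wise fusion
    return g_input[:out_channels]                            # (out_channels, H, W)


latent, plc = build_generator_inputs()
img = fake_generator(latent, plc)
print(img.shape)  # (3, 64, 64)
```

The key structural point the sketch captures is that the FLC enters through the latent code while the PLC enters as a spatially aligned raster fused channel-wise, matching the abstract's description of the two conditioning levels.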