Paper Title
On Deep Unsupervised Active Learning
Paper Authors
Paper Abstract
Unsupervised active learning has attracted increasing attention in recent years, with the goal of selecting representative samples in an unsupervised setting for human annotation. Most existing works are based on shallow linear models: they assume that each sample can be well approximated by the span (i.e., the set of all linear combinations) of certain selected samples, and then take these selected samples as the representative ones to label. In practice, however, the data do not necessarily conform to linear models, and how to model the nonlinearity of the data often becomes the key to success. In this paper, we present a novel Deep neural network framework for Unsupervised Active Learning, called DUAL. DUAL explicitly learns a nonlinear embedding that maps each input into a latent space through an encoder-decoder architecture, and introduces a selection block to select representative samples in the learnt latent space. In the selection block, DUAL aims to simultaneously preserve the whole input patterns as well as the cluster structure of the data. Extensive experiments are performed on six publicly available datasets, and the experimental results clearly demonstrate the efficacy of our method compared with state-of-the-art methods.
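The shallow linear assumption mentioned in the abstract (every sample is approximated by the span of a few selected samples) can be illustrated with a minimal NumPy sketch. This is not DUAL and not any specific method from the paper: the function names `reconstruction_error` and `greedy_select`, and the greedy search itself, are assumptions introduced here purely for illustration.

```python
import numpy as np


def reconstruction_error(X, selected):
    """Squared error of approximating every row of X by the span of the selected rows."""
    S = X[selected]  # (k, d) matrix of selected samples
    # Least-squares coefficients C minimizing ||S.T @ C - X.T||_F^2, shape (k, n).
    coeffs, *_ = np.linalg.lstsq(S.T, X.T, rcond=None)
    residual = X.T - S.T @ coeffs
    return float(np.sum(residual ** 2))


def greedy_select(X, k):
    """Hypothetical greedy baseline: add, one at a time, the sample that most
    reduces the linear reconstruction error of the whole dataset."""
    selected = []
    for _ in range(k):
        candidates = [i for i in range(X.shape[0]) if i not in selected]
        best = min(candidates, key=lambda i: reconstruction_error(X, selected + [i]))
        selected.append(best)
    return selected


# Toy data lying exactly in a 2-dimensional linear subspace of R^5.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 5))
X = rng.normal(size=(20, 2)) @ basis
chosen = greedy_select(X, 2)
```

On data like this, which lie exactly in a low-dimensional linear subspace, two selected samples are enough to reconstruct all the others. When the data lie on a nonlinear manifold, the same linear reconstruction degrades, which is the limitation the abstract points to as motivation for learning a nonlinear embedding first.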