Paper Title

Frugal Reinforcement-based Active Learning

Paper Authors

Sebastien Deschamps, Hichem Sahbi

Paper Abstract

Most of the existing learning models, particularly deep neural networks, rely on large datasets whose hand-labeling is expensive and time-demanding. A current trend is to make the learning of these models frugal and less dependent on large collections of labeled data. Among the existing solutions, deep active learning is currently witnessing major interest; its purpose is to train deep networks using as few labeled samples as possible. However, the success of active learning depends heavily on how critical these samples are when training the models. In this paper, we devise a novel active learning approach for label-efficient training. The proposed method is iterative and aims at minimizing a constrained objective function that mixes diversity, representativity and uncertainty criteria. The approach is probabilistic and unifies all these criteria in a single objective function whose solution models the relevance of samples (i.e., how critical they are) when learning a decision function. We also introduce a novel weighting mechanism based on reinforcement learning, which adaptively balances these criteria at each training iteration using a particular stateless Q-learning model. Extensive experiments conducted on staple image classification data, including Object-DOTA, show the effectiveness of our proposed model w.r.t. several baselines including random, uncertainty and flat, as well as related work.
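
To make the abstract's weighting mechanism more concrete, below is a minimal sketch of how a stateless Q-learning scheme could adapt the weights of the diversity, representativity and uncertainty criteria across active-learning rounds. This is an illustrative assumption, not the authors' exact formulation: the class and variable names, the epsilon-greedy selection, the softmax mapping from Q-values to mixing weights, the reward definition (e.g., gain in validation accuracy), and all hyperparameters are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
CRITERIA = ("diversity", "representativity", "uncertainty")

class StatelessQWeighter:
    """Illustrative stateless Q-learning: one Q-value per criterion, no state."""

    def __init__(self, lr=0.1, gamma=0.9, epsilon=0.2):
        self.q = np.zeros(len(CRITERIA))
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon

    def select(self):
        # Epsilon-greedy choice of which criterion to reinforce this round.
        if rng.random() < self.epsilon:
            return int(rng.integers(len(CRITERIA)))
        return int(np.argmax(self.q))

    def update(self, action, reward):
        # Stateless Q-learning update: Q(a) <- Q(a) + lr * (r + gamma * max Q - Q(a)).
        target = reward + self.gamma * self.q.max()
        self.q[action] += self.lr * (target - self.q[action])

    def weights(self):
        # Softmax over Q-values gives the mixing weights of the acquisition objective.
        e = np.exp(self.q - self.q.max())
        return e / e.sum()

# Hypothetical usage inside an active-learning loop.
weighter = StatelessQWeighter()
for round_ in range(5):
    action = weighter.select()
    w = weighter.weights()        # weights for diversity / representativity / uncertainty
    # ... score the unlabeled pool with w, query labels, retrain the model ...
    reward = rng.random()         # stand-in for the observed gain (e.g., validation accuracy)
    weighter.update(action, reward)
    print(round_, dict(zip(CRITERIA, np.round(w, 3))))
```

The key point the sketch conveys is that, because there is no state, the Q-table reduces to a single vector over criteria, and the mixing weights are re-derived from it at every training iteration.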
